Starweaver


    Designing and Building Big Data Applications


    Course Features

    • Duration: 32 hours
    • Skill Level: All levels
    • Language: English


    DESCRIPTION

    This four-day training in designing and building Big Data applications prepares you to analyze and solve real-world problems using Apache Hadoop and associated tools in the enterprise data hub (EDH).

    You will work through the entire process of designing and building solutions, including ingesting data, determining the appropriate file format for storage, processing the stored data, and presenting the results to the end-user in an easy-to-digest form. Go beyond MapReduce to use additional elements of the EDH and develop converged applications that are highly relevant to the business.
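
The four stages named above (ingest, store, process, present) can be sketched in miniature. This is only an illustration, not course material: it uses plain Python and an in-memory buffer in place of Hadoop, and the event data, the JSON Lines format, and every function name are assumptions made up for the example.

```python
import io
import json
from collections import Counter

# Hypothetical raw input: one "user,action" event per line.
RAW_EVENTS = ["alice,login", "bob,login", "alice,search", "alice,logout"]


def ingest(lines):
    """Parse raw text lines into records (the ingestion step)."""
    return [dict(zip(("user", "action"), line.split(","))) for line in lines]


def store(records, sink):
    """Write records as JSON Lines, standing in for the file-format choice."""
    for record in records:
        sink.write(json.dumps(record) + "\n")


def process(source):
    """Read the stored records back and count events per user."""
    return Counter(json.loads(line)["user"] for line in source if line.strip())


def present(counts):
    """Render a small plain-text report, most active user first."""
    return "\n".join(f"{user}: {n}" for user, n in counts.most_common())


buffer = io.StringIO()  # in-memory stand-in for cluster storage (assumption)
store(ingest(RAW_EVENTS), buffer)
buffer.seek(0)
report = present(process(buffer))
print(report)
```

In the course itself each stage maps to a dedicated tool (for example Flume or Sqoop for ingestion, Avro for storage, Crunch or Hive for processing, and Impala or Cloudera Search for presentation), so the sketch only shows how the stages hand data to one another.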

    TARGET AUDIENCE

    • This course is best suited to developers, engineers, and architects who want to use Hadoop and related tools to solve real-world problems. Participants should have already attended Cloudera Developer Training for Apache Hadoop or have equivalent practical experience. Good knowledge of Java and basic familiarity with Linux are required. Experience with SQL is helpful.

    OBJECTIVES

    At the end of the course, students will be able to:

    • Create a data set with the Kite SDK
    • Develop custom Flume components for data ingestion
    • Manage a multi-stage workflow with Oozie
    • Analyze data with Crunch
    • Write user-defined functions for Hive and Impala
    • Transform data with Morphlines
    • Index data with Cloudera Search

    CURRICULUM

    1) Application Architecture
    • Scenario Explanation
    • Understanding the Development Environment
    • Identifying and Collecting Input Data
    • Selecting Tools for Data Processing and Analysis
    • Presenting Results to the User
    2) Defining and Using Data Sets
    • Metadata Management
    • What is Apache Avro?
    • Avro Schemas
    • Avro Schema Evolution
    • Selecting a File Format
    • Performance Considerations
    3) Using the Kite SDK Data Module
    • What is the Kite SDK?
    • Fundamental Data Module Concepts
    • Creating New Data Sets Using the Kite SDK
    • Loading, Accessing, and Deleting a Data Set
    4) Importing Relational Data with Apache Sqoop
    • What is Apache Sqoop?
    • Basic Imports
    • Limiting Results
    • Improving Sqoop's Performance
    • Sqoop 2
    5) Capturing Data with Apache Flume
    • What is Apache Flume?
    • Basic Flume Architecture
    • Flume Sources
    • Flume Sinks
    • Flume Configuration
    • Logging Application Events to Hadoop
    6) Developing Custom Flume Components
    • Flume Data Flow and Common Extension Points
    • Custom Flume Sources
    • Developing a Flume Pollable Source
    • Developing a Flume Event-Driven Source
    • Custom Flume Interceptors
    • Developing a Header-Modifying Flume Interceptor
    • Developing a Filtering Flume Interceptor
    • Writing Avro Objects with a Custom Flume Interceptor
    7) Managing Workflows with Apache Oozie
    • The Need for Workflow Management
    • What is Apache Oozie?
    • Defining an Oozie Workflow
    • Validation, Packaging, and Deployment
    • Running and Tracking Workflows Using the CLI
    • Hue UI for Oozie
    8) Processing Data Pipelines with Apache Crunch
    • What is Apache Crunch?
    • Understanding the Crunch Pipeline
    • Comparing Crunch to Java MapReduce
    • Working with Crunch Projects
    • Reading and Writing Data in Crunch
    • Data Collection API Functions
    • Utility Classes in the Crunch API
    9) Working with Tables in Apache Hive
    • What is Apache Hive?
    • Accessing Hive
    • Basic Query Syntax
    • Creating and Populating Hive Tables
    • How Hive Reads Data
    • Using the RegexSerDe in Hive
    10) Developing User-Defined Functions
    • What are User-Defined Functions?
    • Implementing a User-Defined Function
    • Deploying Custom Libraries in Hive
    • Registering a User-Defined Function in Hive
    11) Executing Interactive Queries with Impala
    • What is Impala?
    • Comparing Hive to Impala
    • Running Queries in Impala
    • Support for User-Defined Functions
    • Data and Metadata Management
    12) Understanding Cloudera Search
    • What is Cloudera Search?
    • Search Architecture
    • Supported Document Formats
    13) Indexing Data with Cloudera Search
    • Collection and Schema Management
    • Morphlines
    • Indexing Data in Batch Mode
    • Indexing Data in Near Real Time
    14) Presenting Results to Users
    • Solr Query Syntax
    • Building a Search UI with Hue
    • Accessing Impala through JDBC
    • Powering a Custom Web Application with Impala and Search

    795 Folsom Street, San Francisco, California 94107 || +1-415-483-2260 // +44 20 3289 3277


    © Starweaver Group, Inc. All Rights Reserved.
