Digital Transformation: Advanced Big Data
irna
Fri, 02/17/2023 - 03:35
Duration
2 Days
Exam Details
This course is part of the following certification track(s):
- Digital Transformation Data Scientist
Price
--- Call Us ---
This course provides an in-depth overview of essential and advanced topics in data science and analysis techniques that are relevant and unique to Big Data. It emphasizes how analysis and analytics need to be carried out, individually and collectively, to address the distinct characteristics, requirements and challenges of Big Data datasets.
Key Outcomes:
Students will gain an understanding of:
- Probability, Frequency, Statistical Estimators, Confidence Interval, etc.
- Basic Mathematical Notations
- Data Discretization, Binning and Clustering
- Numerical Summaries, Modeling, Model Evaluation, Model Fitting and Model Overfitting
- Association Rules and Apriori Algorithm
- Decision Trees for Big Data
- Time Series Analysis, Trend, Seasonality, K Nearest Neighbor (kNN), K-means
Objectives
- Exploratory Data Analysis, Essential Statistics, including Variable Categories and Relevant Mathematics
- Statistical Analysis, including Descriptive, Inferential, Covariance, Hypothesis Testing, etc.
- Measures of Variation or Dispersion, Interquartile Range & Outliers, Z-Score, etc.
- Probability, Frequency, Statistical Estimators, Confidence Interval, etc.
- Variables and Basic Mathematical Notations, Statistical Measures and Statistical Inference
- Confirmatory Data Analysis (CDA)
- Data Discretization, Binning and Clustering
- Visualization Techniques, including Bar Graph, Line Graph, Histogram, Frequency Polygons, etc.
- Prediction, Linear Regression, Mean Squared Error and Coefficient of Determination (R²), etc.
- Numerical Summaries, Modeling, Model Evaluation, Model Fitting and Model Overfitting
- Statistical Models, Model Evaluation Measures
- Cross-Validation, Bias-Variance, Confusion Matrix and F-Score
- Association Rules and Apriori Algorithm
- Data Reduction, Dimensionality, Feature Selection
- Feature Extraction, Data Discretization (Binning and Clustering)
- Parametric vs. Non-Parametric, Clustering vs. Non-Clustering
- Distance-Based, Supervised vs. Semi-Supervised
- Linear Regression and Logistic Regression for Big Data
- Logistic Regression, Naïve Bayes, Laplace Smoothing, etc.
- Decision Trees for Big Data
- Pattern Identification, Association Rules, Apriori Algorithm
- Time Series Analysis, Trend, Seasonality, K Nearest Neighbor (kNN), K-means
- Text Analytics for Big Data and Outlier Detection for Big Data
- Statistical, Distance-Based, Supervised and Semi-Supervised Techniques
Contact
- Nurman (+62 857-2375-3840)
- Irna (+62 822-1664-7749)
- Rakhmat (+62 813-2149-6020)
- Puji (+62 813-2424-2115)
- Alifa (+62 822-1556-8920)