Concepts (ML, DA, DV)

The app uses the concept areas below to steer its analysis and built-in prompts. The AI is instructed to provide runnable Python or SQL for each step, so you can execute it with the Run button or in Jupyter.

  • Supervised learning: Regression (continuous targets), classification (discrete classes). Features and labels, train/test split, overfitting and underfitting.
  • Unsupervised learning: Clustering (K-Means, hierarchical, DBSCAN), dimensionality reduction (PCA, t-SNE, UMAP).
  • Evaluation: Classification (accuracy, precision, recall, F1, ROC-AUC); regression (MSE, RMSE, MAE, R²). Confusion matrix, learning curves.
  • Feature engineering: Selection and extraction, encoding categoricals, scaling, handling missing values, outlier detection.
  • Ensembles: Bagging, boosting, stacking. Random Forest, XGBoost, LightGBM.
  • Deep learning: ANN, CNN, RNN, Transformers (basics).
  • Descriptive analytics: What happened? Summaries, aggregates, dashboards.
  • Diagnostic analytics: Why did it happen? Drill-downs, correlations, root cause.
  • Predictive analytics: What will happen? Forecasting, classification, risk.
  • Prescriptive analytics: What should be done? Recommendations, optimization.
  • EDA: Summary statistics, distributions, correlations, outlier detection, hypothesis generation.
  • Statistics: Distributions, central tendency and dispersion, sampling, hypothesis testing, confidence intervals.
  • Analysis techniques: Trend analysis, cohort analysis, funnel analysis, time series, forecasting, A/B testing, segmentation.
  • Visualization principles: Accuracy, clarity, simplicity, context, accessibility.
  • Chart types: Line, bar, histogram, scatter, pie, heatmaps, box plots, violin plots, treemaps, network graphs.
  • Storytelling and dashboards: Narrative flow, KPI hierarchy, interactivity, filtering, drill-downs.
  • Visualization for ML: Feature importance, confusion matrices, ROC curves, residual plots, interpretability visuals.
  • Visualization ethics: Avoid misleading charts, use proper scaling, make honest comparisons.
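
The evaluation metrics named in the list above (accuracy, precision, recall, F1; MSE, RMSE, MAE, R²) can be illustrated with a minimal, dependency-free sketch. This is not the app's own code: the function names and the sample labels are hypothetical, and in practice a library such as scikit-learn would normally be used instead.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 derived from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE and R² for continuous targets."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        # R² is undefined when the targets have zero variance.
        "r2": 1 - ss_res / ss_tot if ss_tot else float("nan"),
    }


# Hypothetical labels and predictions, for illustration only.
clf = classification_metrics(
    y_true=[1, 0, 1, 1, 0, 0, 1, 0],
    y_pred=[1, 0, 0, 1, 0, 1, 1, 0],
)
reg = regression_metrics(
    y_true=[3.0, 5.0, 7.0, 9.0],
    y_pred=[2.5, 5.0, 7.5, 9.0],
)
print(clf)
print(reg)
```

Deriving every classification metric from the four confusion-matrix counts keeps the relationship between the metrics visible, which is useful when reading a confusion matrix plot.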
  1. Data collection
  2. Data cleaning and preprocessing
  3. Exploratory data analysis
  4. Feature engineering
  5. Model selection and training
  6. Evaluation and validation
  7. Visualization and reporting
  8. Deployment and monitoring

The app asks the AI to provide runnable code for each of these steps so you can run and adapt it in the notebook.
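
As a rough illustration of what per-step code could look like, the sketch below walks through steps 1 to 7 of the workflow on a tiny made-up dataset using only the Python standard library. Everything in it is an assumption for illustration: the data, the variable names, and the choice of a one-feature least-squares line as the model. It is not the code the app generates, and step 8 (deployment and monitoring) is outside its scope.

```python
import math
import random
import statistics

# 1. Data collection: hypothetical (hours_studied, exam_score) records.
#    None marks a missing value.
raw = [
    (1.0, 52.0), (2.0, 55.0), (3.0, None), (4.0, 65.0), (5.0, 70.0),
    (6.0, 74.0), (None, 80.0), (8.0, 85.0), (9.0, 91.0), (10.0, 94.0),
]

# 2. Data cleaning and preprocessing: drop rows with missing values.
clean = [(x, y) for x, y in raw if x is not None and y is not None]

# 3. Exploratory data analysis: basic summary statistics.
summary = {
    "rows": len(clean),
    "mean_hours": statistics.mean(x for x, _ in clean),
    "mean_score": statistics.mean(y for _, y in clean),
}

# 4. Feature engineering: hold out a test set first, then standardize the
#    feature using training statistics only (avoids leaking test data).
shuffled = clean[:]
random.Random(0).shuffle(shuffled)  # fixed seed for a repeatable split
n_test = max(1, len(shuffled) // 4)
test, train = shuffled[:n_test], shuffled[n_test:]

mu = statistics.mean(x for x, _ in train)
sigma = statistics.stdev(x for x, _ in train)


def scale(x):
    """Standardize a raw feature value with the training mean and stdev."""
    return (x - mu) / sigma


# 5. Model selection and training: ordinary least squares for one feature.
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


slope, intercept = fit_line([scale(x) for x, _ in train], [y for _, y in train])

# 6. Evaluation and validation: RMSE on the held-out rows.
predictions = [slope * scale(x) + intercept for x, _ in test]
rmse = math.sqrt(
    sum((y - p) ** 2 for (_, y), p in zip(test, predictions)) / len(test)
)

# 7. Visualization and reporting: a plain-text report stands in for charts.
print(f"EDA summary: {summary}")
print(f"Model: score = {slope:.2f} * scaled_hours + {intercept:.2f}")
print(f"Held-out RMSE: {rmse:.2f}")
```

Splitting before scaling is a deliberate choice here: the scaling statistics come from the training rows alone, so the evaluation in step 6 is not influenced by the held-out data.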