Data Science Course Objectives
Each data science course offered through the Undergraduate Certificate in Data Science has a set of educational objectives for students, outlined below. Educators interested in articulation should review these objectives and contact us at data@ku.edu with any questions.
Data 1: Introduction to Data Science
- Exposure to different types of data sets (e.g., census, experimental)
- Identify questions that are answerable from the data available
- Calculate new variables from existing variables
- Exposure to importing and joining relational data
- Statistical Foundations
  - Formulate null and alternative hypotheses
  - Construct a sampling distribution
  - Calculate observed test statistics
  - Derive p-values and confidence intervals
- Matching variable types to appropriate types of visualization
- Basic exploratory data analysis and interpretation
- Learn about basic data structures (e.g., arrays, vectors, tables)
- Responsible data management (e.g., transparency, reproducibility)
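
As an illustration of the Statistical Foundations objectives above, the following minimal Python sketch walks through stating hypotheses, building a sampling distribution, computing an observed test statistic, and deriving a p-value. The choice of a permutation test, the function name `permutation_test`, and the data values are assumptions made for this example; they are not taken from the course materials, and confidence intervals are not covered here.

```python
import random
from statistics import fmean


def permutation_test(group_a, group_b, n_resamples=5000, seed=0):
    """Two-sided permutation test for a difference in means.

    H0: both groups come from the same distribution.
    H1: the group means differ.
    """
    rng = random.Random(seed)

    # Observed test statistic: difference in group means.
    observed = fmean(group_a) - fmean(group_b)

    # Sampling distribution under H0: shuffle the labels and recompute.
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    null_stats = []
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        null_stats.append(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))

    # p-value: share of shuffled statistics at least as extreme as observed.
    extreme = sum(abs(s) >= abs(observed) for s in null_stats)
    p_value = extreme / n_resamples
    return observed, null_stats, p_value


# Hypothetical measurements, made up for this example.
treatment = [12.1, 13.4, 11.8, 14.2, 13.0, 12.7]
control = [10.9, 11.5, 12.0, 10.4, 11.1, 11.7]

observed, null_stats, p_value = permutation_test(treatment, control)
print(f"observed difference = {observed:.2f}, p-value = {p_value:.4f}")
```

A small p-value here would suggest that a difference as large as the observed one is unlikely if the group labels were interchangeable.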
Data 2: Intermediate Data Science
- Data Wrangling
  - Row operations: filtering, sorting, grouping, summarization
  - Column operations: selecting, calculations
  - Reshaping data
  - Joining relational data
- Programming
  - Custom functions
  - Iteration
  - File handling
- Visualization principles and techniques
- Communication principles and ethics
- Team-based collaboration and communication
- Creation of a data science portfolio
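
To make the Data Wrangling and Programming objectives above more concrete, here is a minimal Python sketch that uses only the standard library. The tables, the column names, and the helper `summarize_by` are hypothetical and chosen for illustration; the course itself may use a different language or toolkit.

```python
from collections import defaultdict

# Hypothetical tables, made up for this example.
enrollments = [
    {"student": "A", "course": "Data 1", "score": 88},
    {"student": "B", "course": "Data 1", "score": 72},
    {"student": "A", "course": "Data 2", "score": 91},
    {"student": "C", "course": "Data 2", "score": 65},
]
courses = {"Data 1": "Introduction to Data Science",
           "Data 2": "Intermediate Data Science"}


def summarize_by(rows, key, value):
    """Custom function: group rows by `key`, return the mean of `value`."""
    groups = defaultdict(list)
    for row in rows:  # iteration over the rows of the table
        groups[row[key]].append(row[value])
    return {k: sum(v) / len(v) for k, v in groups.items()}


# Row operations: filter, then sort by score (highest first).
passing = [r for r in enrollments if r["score"] >= 70]
passing = sorted(passing, key=lambda r: r["score"], reverse=True)

# Column operation: calculate a new column from an existing one.
passing = [{**r, "score_pct": r["score"] / 100} for r in passing]

# Join: attach the course title from the lookup table.
joined = [{**r, "title": courses[r["course"]]} for r in passing]

# Grouping and summarization: mean score per course.
means = summarize_by(enrollments, "course", "score")
print(means)
```

Building new dictionaries at each step, rather than modifying rows in place, leaves the original `enrollments` table unchanged, which makes each step easier to check on its own.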
Data 3: Data Management
- Database design and management
- Structured querying (e.g., with SQL)
- Version control (e.g., git)
- String manipulation (regular expressions)
- Basics of web scraping
- Big data (larger than memory)
- Web services
- Software as a service
- Parallel computing
Data 4: Introduction to Machine and Statistical Learning
- Linear regression and classification
- Resampling methods (bootstrapping, cross-validation)
- Linear model selection and regularization (lasso, ridge)
- Nonlinear regression (polynomial, local, spline, step, additive)
- Tree-based methods (decision trees, random forests)
- Unsupervised learning (k-means, principal components)
- Data bias and ethical practices in modeling
Capstone / Community Data Labs
- Define a problem statement given a dataset and constraints
- Communicate with a community partner / real-world client
- Apply data science skills to a capstone project
- Contribute project work to student portfolio