Data Analysis with Python

ABOUT THE COURSE!

As part of our Certificate Program in Data Science, this course provides a fundamental understanding of using SQL to access data, data visualization, and exploratory data analysis.

Data analysis is the process of obtaining raw data and converting it into information that users can act on when making decisions. Data is the necessary input to any analysis, and much of the world’s data resides in databases, so extracting data from a database with SQL is an essential technique for anyone who wants to become a data scientist. Once obtained, the data must be processed, organised, and cleaned before it can be analysed. This course covers data wrangling and data exploration using Python libraries such as NumPy and pandas. One of the key skills of a data scientist is the ability to tell a compelling story by visualizing data and findings in an approachable and stimulating way, so the course also teaches various techniques for presenting data visually.

Moreover, this course gives learners opportunities to practice the skills required to leverage data to reveal valuable insights and advance their careers.

To begin the course, let's take a few minutes to explore the course site. Review the material we’ll cover each week, and preview the assignments/projects/quizzes you’ll need to complete to pass the course.

Main concepts are delivered through videos, demos and hands-on exercises.

COURSE INFORMATION

Course code: DSP302x
Course name: Data Analysis with Python
Credits: 3
Estimated Time: 6 weeks. Students should allocate an average of 2 hours per day to complete the course.

COURSE OBJECTIVES

  • Comprehends the basics of SQL and applies SQL statements to query data
  • Practices with advanced SQL statements
  • Uses Python to access a database
  • Comprehends data analysis and applies Python packages to import and export data
  • Explains common problems in data and practices techniques to handle missing values and normalize data
  • Practices descriptive statistics and explains various statistical correlation methods
  • Explains and practices model development, evaluation, and tuning
  • Comprehends Data Visualization and practices with Matplotlib
  • Uses Seaborn to create plots
  • Applies advanced data visualization

COURSE STRUCTURE

Module 1:  Databases and SQL for Data Science

  • Lesson 1: Introduction to Databases
  • Lesson 2: Basic SQL
  • Lesson 3: String Patterns, Ranges, Sorting, and Grouping
  • Lesson 4: Functions, Sub-Queries, Multiple Tables
  • Lesson 5: Accessing databases using Python

Assignment 1: Real-world database project using SQL statements

Module 2: Data Analysis

  • Lesson 6: Importing and Exporting Datasets
  • Lesson 7: Data Wrangling
  • Lesson 8: Exploratory Data Analysis
  • Lesson 9: Model Development
  • Lesson 10: Model Evaluation and Refinement

Module 3:  Data Visualization with Python

  • Lesson 11:  Introduction to Data Visualization
  • Lesson 12: Basic Visualization Tools
  • Lesson 13: Specialized Visualization Tools
  • Lesson 14: Advanced Visualization Tools
  • Lesson 15: Visualizing Geospatial Data
  • Lesson 16: Interactive Visualization with Plotly

Assignment 2: NYC taxi analysis project

DEVELOPMENT TEAM

COURSE DESIGNERS

M.S. Vu Thuong Huyen

  • Data Scientist at FPT Software Company Limited – a subsidiary of FPT Corporation
  • Master of Software Engineering, VNU University of Engineering and Technology
  • Bachelor of Engineering, School of Applied Mathematics and Informatics, Hanoi University of Science and Technology
  • Research fields: Machine learning, Deep learning, Reinforcement Learning, Natural Language Processing…
  • Profile online: https://www.linkedin.com/in/thuong-huyen-3969747a/ 

REVIEWERS & TESTER

Course Reviewer

Ph.D. Dang Hoang Vu

  • FPT Science Director
  • Ph.D. in Mathematics, University of Cambridge
  • Core member of R&D activities in FPT Corporation
  • Mainly responsible for the analytics side of FPT’s Data Management Platform and for data science research

Course Tester

M.Sc. Nguyen Cong Thanh

  • Data Analytics staff at FE Credit
  • Master’s student, School of Quantitative and Computational Finance, John Von Neumann Institute
  • Bachelor of Financial Mathematics, School of Mathematics and Statistics, University of Economics HCM City
  • Research fields: Machine learning
  • Profile online: https://www.linkedin.com/in/thanh-nguyen-cong-04b24a100/

Program Reviewers

Assoc. Prof. Tu Minh Phuong

  • Dean of IT Faculty, Posts and Telecommunications Institute of Technology (PTIT)
  • Expert & technological consultant in AI & machine learning
  • Head of Machine Learning & Application laboratory in PTIT

Ph.D. Nguyen Van Vinh

  • Lecturer & core member of AI Lab, University of Technology - VNU
  • AI expert & consultant for DPS, Fsoft
  • Ph.D. in Computer Science, Japan Advanced Institute of Science & Technology
  • Bachelor’s degree in IT, University of Science, VNU

Ph.D. Tran The Trung

  • Director of FPT Technology Research Institute, FPT University
  • Ph.D. in Computational Physics, UVSQ Université de Versailles Saint-Quentin-en-Yvelines
  • M.S. in Astrophysics, Pierre & Marie Curie University
  • B.S. in Theoretical & Mathematical Physics, University of Melbourne

MOOC MATERIALS

Below is the list of all free massive open online courses (MOOCs) from Coursera that FUNiX uses for this course:


Learning resources

Today, every subject has numerous relevant study materials, including printed and online books. FUNiX Way does not prescribe a specific learning resource; instead, it offers recommendations so that students can choose the source most appropriate for them. While studying from the sources they have chosen, students are promptly connected to a mentor who answers their questions. All assessments, including multiple-choice questions, exercises, projects, and oral exams, are designed, developed, and conducted by FUNiX.

Learners are under no obligation to use a fixed learning material. They are encouraged to actively find and study from any appropriate source, including printed textbooks, MOOCs, or websites. Students are responsible for how they use these learning sources and for ensuring full compliance with the source owners’ policies, except where the source owner has an official cooperation with FUNiX. For further support, feel free to contact the FUNiX Academic Department for detailed instructions.

Recommended learning resources are listed below. Note that listing these learning sources does not necessarily imply that FUNiX has an official partnership with the source’s owner: Coursera, tutorialspoint, edX Training, or Udemy.


 Feedback channel

FUNiX is ready to receive and discuss all comments and feedback related to learning materials via email [email protected]
