1. What does ETL stand for in the context of data science?
2. What is the purpose of data validation in the ETL process?
3. Which of the following describes structured data?
4. What is a dataset?
5. What is an example of a first-party data source?
6. What is a key advantage of public APIs in data collection?
7. Which of the following is a third-party data source?
8. What is one benefit of using generated data in data science?
9. What is the purpose of data transformation in the ETL process?
10. What does deduplication in data science refer to?
11. What is a benefit of using word embedding techniques in data science?
12. Which of the following is an example of a quantitative feature in a dataset?
13. What does the "range" of a quantitative dataset represent?
14. How does continuous data differ from discrete data?
15. Why is data parsing important in the ETL process?
16. What is the key purpose of feature scaling in data preparation?
17. What type of error is commonly corrected during the data-cleaning phase?
18. Which tool is most commonly used to visualize data for non-practitioners?
19. Why is deduplication important in data science?
20. What is the main challenge of loading large volumes of data into databases?