COMPARATIVE ANALYSIS OF CHATGPT AND RE3DATA.ORG FOR FINDING DATA REPOSITORIES IN SOCIAL SCIENCE
DOI:
https://doi.org/10.62405/osi.2025.01.03
Keywords:
ChatGPT, Re3Data.org, data repositories, social science, AI
Abstract
Artificial intelligence (AI) is increasingly important in scholarly communication. Despite concerns about academic integrity, AI tools offer potential benefits for researchers navigating the complex landscape of research data repositories. This study explores whether Chat Generative Pre-trained Transformer (ChatGPT) can effectively identify and recommend quantitative and qualitative datasets in the social sciences. We compare how ChatGPT (version 3.5) identifies data repositories with how the specialized Re3Data.org registry does. The results show that ChatGPT returns relevant repository recommendations that complement rather than duplicate those found through Re3Data.org, giving researchers a broader range of options. Standard searches in Re3Data.org offered more structured results with disciplinary categorization, while ChatGPT described repositories with richer contextual information about their contents. In specialized searches for datasets on generative AI in academic contexts, ChatGPT identified specific datasets across multiple repositories and supplied detailed metadata. However, when asked about broader empirical trends, such as the proportion of quantitative versus qualitative research, ChatGPT provided only generalized responses without precise statistics, highlighting its limitations in accessing current empirical data. We conclude that, while ChatGPT cannot yet generate repository data of suitable quality for advanced-level analyses in all contexts, it is a valuable complement to traditional repository registries. As AI tools continue to develop, educators and scholars must shift their focus from negative expectations to the practical benefits these tools can provide in research data discovery.
Versions
- 2025-04-21 (3)
- 2025-04-17 (2)
- 2025-04-17 (1)