Using large language models and generative AI in research

Emerging large language model (LLM) and artificial intelligence (AI) technologies, such as ChatGPT, are affecting UBC’s research environment. This page provides a brief overview of the opportunities and challenges associated with using these tools in research.

This page will be updated frequently as the VPRI portfolio engages with the UBC research community and others to address these opportunities and challenges. Please check back regularly for updates.

The VPRI portfolio will also be offering upcoming workshops and forums for discussion.

We invite you to get in touch via research.innovation@ubc.ca if there are topics you would like to see addressed on this page.

Roles & Uses of LLMs in Research

The use of LLMs and AI, including in the research environment, is itself an evolving area of scholarship. It spans many aspects, relating for example to ethics and to technical development and modelling, and has many areas of application.

In addition to this area of scholarship, LLMs and AI tools could affect the following broad areas of research projects and programs:

  • grant applications;
  • conducting research;
  • research publications; and
  • award nominations.

Within these areas, they are most commonly used to assist with generating content in the following activities:

  • drafting content
  • structuring content
  • summarizing existing knowledge and content
  • generating figures and images
  • generating code
  • some hypothesis testing

Challenges

The use of these models can pose a number of challenges for researchers.

Data Ownership

  • Privacy & user data
    • Who owns what you submit? Issues may arise from both user data and the information you input into the models, so it is essential to fully understand a tool’s privacy policy before use.
  • Copyright infringement
    • Internet-scraped training data may include copyrighted text and images, and reproducing that material in generated output may constitute copyright infringement.

Authorship and Plagiarism

  • Authorship status is uncertain when text is generated by these models: can you claim authorship of it?
  • Beyond reproducing copyrighted content, does generated content cross thresholds for plagiarism?

Inventorship

  • Public Disclosure:
    • Depending on data ownership and confidentiality agreements around tools, does submitting potential intellectual property to systems like ChatGPT count as a public disclosure and invalidate any future patent applications?
    • Will the data and concepts you input be incorporated into the models and shared more broadly?
  • Attribution
    • If AI-generated content ends up in patent claims, who is the inventor? Case law on this question is still developing.
    • The U.S. Copyright Office has taken the position that works created by an AI without human intervention are not copyrightable.

Data Validity

  • Accuracy & Due Diligence
    • What confidence is there that generated responses are accurate? Models have frequently been observed to produce factual inaccuracies and fabricated citations; two lawyers in New York were fined for filing a legal brief that cited AI-generated case precedents later found to be fabricated.
  • Reproducibility
    • Without transparency around how content is generated, how confident can one be that data, data displays or visualizations can be reproduced?
  • Veracity
    • Is data generated by LLM/AI something that can be trusted? If testing hypotheses, can any AI-generated results be used in any way in your research?
  • Bias
    • LLMs reflect the data they were trained on, and so reproduce any biases present in those sources.

Publications

  • Citations
    • How should the use of these tools be cited or acknowledged in a publication?
  • Restrictions
    • Publishers, such as Nature, are beginning to issue guidelines for acknowledging legitimate use of these models.

Access

What needs to be considered when accessing LLM and AI tools?

  • Evaluation
    • The LibrAIry has designed the ROBOT test as a tool for evaluating AI. The ROBOT acronym (reliability, objective, bias, ownership, and type) can help you remember important criteria for evaluating a new and unknown AI tool.
  • Institutional accounts/access
    • What roles can institutions play in providing or guiding access to LLM tools?

UBC Policies, Scholarly Integrity & Responsible Conduct of Research

Questions continue to emerge around how the use of LLM and AI models relates to institutional policies and the responsible conduct of research.


UBC Resources