Using large language models and generative AI in research

Emerging large language model (LLM) and artificial intelligence (AI) technologies, such as ChatGPT, are affecting UBC’s research environment. This page provides a brief overview of the opportunities and challenges associated with using these tools in research.

This page will be updated frequently as the VPRI portfolio engages with the UBC research community and others to address these opportunities and challenges. Please check back regularly for updates.

The VPRI portfolio will also be offering upcoming workshops and forums for discussion.

We invite you to get in touch via research.innovation@ubc.ca if there are topics you would like to see addressed on this page.

Roles & Uses of LLMs in Research

The use of LLMs and AI, including in the research environment, is itself an evolving area of scholarship. It spans many aspects, relating for example to ethics and to technical development and modelling, and has many areas of application.

In addition to this area of scholarship, LLMs and AI tools could affect the following broad areas of research projects and programs:

  • grant applications;
  • conducting research;
  • research publications; and
  • award nominations.

Within these areas, they are most commonly used to assist with generating content in the following activities:

  • drafting content
  • structuring content
  • summarizing existing knowledge and content
  • generating figures and images
  • generating code
  • some hypothesis testing

Challenges

The use of these models can pose a number of challenges for researchers.

Data Ownership

  • Privacy & user data
    • Who owns what you submit? Issues may arise from both user data and the information you input into the models, so it is essential to fully understand a tool’s privacy policy before use.
  • Copyright infringement
    • Internet-scraped training data may include copyrighted text and images, and reproducing that material in generated output may constitute copyright infringement.

Authorship and Plagiarism

  • Authorship status is uncertain when text is generated by these models: can you claim authorship of it?
  • Beyond reproducing copyrighted content, does generated content cross thresholds for plagiarism?

Inventorship

  • Public Disclosure:
    • Depending on data ownership and confidentiality agreements around tools, does submitting potential intellectual property to systems like ChatGPT count as a public disclosure and invalidate any future patent applications?
    • Will the data and concepts you input be incorporated into the models and shared more broadly?
  • Attribution
    • If AI-generated content ends up in patent claims, who is the inventor? Case law on this question is still developing.
    • The U.S. Copyright Office has taken the position that works created by an AI without human intervention are not copyrightable.

Data Validity

  • Accuracy & Due Diligence
    • What confidence is there that generated responses are accurate? Models have frequently been observed to produce factual inaccuracies and fabricated citations; two lawyers in New York were fined for filing a legal brief that cited AI-generated case precedents later found to be fabricated.
  • Reproducibility
    • Without transparency around how content is generated, how confident can one be that data, data displays or visualizations can be reproduced?
  • Veracity
    • Is data generated by LLM/AI something that can be trusted? If testing hypotheses, can any AI-generated results be used in any way in your research?
  • Bias
    • LLMs reflect the data they were trained on, and so reproduce any biases present in those sources.

Publications

  • Citations
    • How should the use of these tools be cited or acknowledged in a publication?
  • Restrictions
    • Publishers, such as Nature, are beginning to issue guidelines for acknowledging legitimate use of these models.

Access

What needs to be considered when accessing LLM and AI tools?

  • Evaluation
    • The LibrAIry has designed the ROBOT test as a tool for evaluating AI. The ROBOT acronym (reliability, objective, bias, ownership, and type) can help you remember important criteria for evaluating a new and unknown AI tool.
  • Institutional accounts/access
    • What roles can institutions play in providing or guiding access to LLM tools?

UBC Policies, Scholarly Integrity & Responsible Conduct of Research

Questions continue to emerge around how the use of LLM and AI models relates to institutional policies and the responsible conduct of research.


UBC Resources