UMBC CMSC 471 02 Spring 2021
Homework EX1: Experimenting with Transformers

out Tue 4/27, due Wed 5/21

Assignment repository invitation

This is a simple extra credit homework that asks you to experiment with some pretrained GPT-2 models to generate text from initial prompts. The transformer is a neural network model introduced in 2017 that is mostly used for text-processing applications like machine translation, text summarization, and question answering. GPT-2 and GPT-3 are large-scale unsupervised transformer language models developed by the company OpenAI that are trained to predict the text that might follow an initial sequence of words. The current publicly available GPT-2 network has 1.5B parameters. It was trained on the WebText dataset, which has 40GB of text extracted from the contents of 45 million links posted by users of the Reddit social network on or before 2017.

Talk To Transformer is an online demonstration by Adam Daniel King that was built with GPT-2 in mid-2019. It asks for the beginning of some text and automatically extends it with additional sentences. Write With Transformer is another web-based demo, supported by Hugging Face, that uses GPT-2. You can also easily experiment with the Hugging Face GPT-2 system in Python using its Transformers package; a short sketch is shown below.

GPT-3 was released in the summer of 2020. It is a larger network with 175B parameters. Microsoft has recently licensed "exclusive" use of GPT-3 and control of its source code. Others can still use the public API to receive output after applying for an API key, but it's not easy to get one. A paper on GPT-3, Language Models are Few-Shot Learners, received a best-paper award at the prestigious NeurIPS 2020 conference. There are many popular articles with examples of what GPT-3 can and cannot do, and these will no doubt continue in the coming year.

Haim is another interesting system that you can try online. It uses a model similar to GPT-2 that was trained on a public subset of OpenAI's WebText data. What's novel and interesting about Haim is that it uses what its developers call an "interpolating language model": you give it both the beginning and the end of a passage, and it generates text to connect them.
As an example, we gave Haim the first and last sentence of this paragraph and it filled in the rest.
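If you want to try the Python route, here is a minimal sketch of GPT-2 generation with the Hugging Face Transformers package, assuming you have installed transformers and a backend such as PyTorch. The prompt and the sampling parameters below are illustrative choices, not part of the assignment.

```python
# Minimal sketch: generate text from a prompt with GPT-2 via the
# Hugging Face Transformers pipeline (pip install transformers torch).
from transformers import pipeline

# Downloads and caches the (smallest) pretrained GPT-2 model on first use.
generator = pipeline("text-generation", model="gpt2")

prompt = "Artificial intelligence will change college education because"

outputs = generator(
    prompt,
    max_length=100,          # total length in tokens, including the prompt
    num_return_sequences=3,  # produce three alternative continuations
    do_sample=True,          # sample instead of greedy decoding
    top_p=0.9,               # nucleus sampling; values are illustrative
)

for i, out in enumerate(outputs, start=1):
    print(f"--- sample {i} ---")
    print(out["generated_text"])
```

Because the pipeline samples, each run produces different continuations; rerunning with the same prompt is an easy way to collect the several examples per topic that the assignment asks for.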
What to do

After reading the items linked from this page, experiment with the two systems. For each, produce a series of medium-length paragraphs on three different topics by providing an initial sentence (or two) and, for Haim, a final sentence (or two). The topics should be varied and might include things like computers, AI, sports, politics, philosophy, or popular music.

Edit the TransformerExperiments.md file in your repository to record your results, using the format of our example, and answer the questions at the end. For each of the two systems, show at least three interesting examples for each topic, ranging from good to bad, and provide a one-sentence assessment of each. Feel free to do more than three topics and to use more than three examples.

Conclude with an overall assessment, a paragraph or two, of Hugging Face's Transformers and Haim. You might address questions like the following: Is it just an interesting example? Might it have utility as it is? What might make it better? Do you think a version trained on a different collection of text (e.g., cybersecurity reports) might be useful?