
Commit e1d9677: readme update (parent c33ed1e)

1 file changed: README.md (+40, −9 lines)

README.md

# Transformer Models for MATLAB

This repository implements deep learning transformer models in MATLAB.

## Requirements
* MATLAB R2020a or later for GPT-2
* MATLAB R2021a or later for BERT and FinBERT
* Deep Learning Toolbox
* Text Analytics Toolbox for BERT and FinBERT

## Getting Started
Download or [clone](https://www.mathworks.com/help/matlab/matlab_prog/use-source-control-with-projects.html#mw_4cc18625-9e78-4586-9cc4-66e191ae1c2c) this repository to your machine and open it in MATLAB.

## Functions
### [bert](./bert.m)
`mdl = bert` loads a pretrained BERT transformer model and, if necessary, downloads the model weights. The `mdl` struct has two fields: `Tokenizer`, the BERT tokenizer, and `Parameters`, the model weights to pass to `bert.model(x,mdl.Parameters)`, where `x` is an encoded sequence such as `seq{1}` with `seq = mdl.Tokenizer.encode("hello world!");`.

`mdl = bert('Model',modelName)` specifies an optional model. All models besides `"multilingual-cased"` are case-insensitive. The choices for `modelName` are:
* `"base"` (default) - A 12 layer model with hidden size 768.
* `"multilingual-cased"` - A 12 layer model with hidden size 768. The tokenizer is case-sensitive. This model was trained on multilingual data.
* `"medium"` - An 8 layer model with hidden size 512.
* `"small"` - A 4 layer model with hidden size 512.
* `"mini"` - A 4 layer model with hidden size 256.
* `"tiny"` - A 2 layer model with hidden size 128.
The model parameters match those found in the [original BERT repo](https://github.com/google-research/bert/). The BERT-Base parameters are from the original release, not from the later update that the smaller models are sourced from.
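
For example, a minimal usage sketch based on the calls described above (the first call downloads the BERT-Base weights):

```matlab
% Load the pretrained BERT-Base model, downloading the weights if needed.
mdl = bert;

% Encode a sentence into a sequence of token codes.
seq = mdl.Tokenizer.encode("hello world!");

% Run the encoded sequence through the BERT model.
x = seq{1};
z = bert.model(x,mdl.Parameters);
```
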
### [gpt2](./gpt2.m)
`mdl = gpt2` loads a pretrained GPT-2 transformer model and, if necessary, downloads the model weights.

### [generateSummary](./generateSummary.m)
`summary = generateSummary(mdl,text)` generates a summary of the string or `char` array `text` using the transformer model `mdl`. The output summary is a char array.

`summary = generateSummary(mdl,text,Name,Value)` specifies additional options using one or more name-value pairs.
* `'Temperature'` - Temperature applied to the GPT-2 output probability distribution. The default is 1.
* `'StopCharacter'` - Character to indicate that the summary is complete. The default is `'.'`.
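
For instance, a sketch combining these options (the particular values are illustrative, not recommendations):

```matlab
% Summarize the help text for eigs with a lower sampling temperature,
% stopping generation at the first period.
mdl = gpt2;
inputText = help('eigs');
summary = generateSummary(mdl,inputText,'Temperature',0.8,'StopCharacter','.');
```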

### [finbert](./finbert.m)
`mdl = finbert` loads a pretrained BERT transformer model fine-tuned to classify the sentiment of financial text. The `mdl` struct is similar to the BERT model struct, with additional weights for the sentiment classifier head. The sentiment analysis functionality is accessed through `[sentimentClass,sentimentScore] = finbert.sentimentModel(x,mdl.Parameters)`, where `x = dlarray(seq{1})` and `seq = mdl.Tokenizer.encode("The FTSE100 suffers dramatic losses on the back of the pandemic.");`.

`mdl = finbert('Model',modelName)` specifies an optional model from the choices:
* `"sentiment-model"` - The fine-tuned sentiment classifier model.
* `"language-model"` - The FinBERT pretrained language model, which uses a BERT-Base architecture.

The parameters match those found in the [original FinBERT repo](https://github.com/ProsusAI/finBERT).
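
Putting these pieces together, a minimal sketch of scoring a single sentence:

```matlab
% Load the fine-tuned FinBERT sentiment classifier.
mdl = finbert;

% Encode a financial sentence and wrap the token codes in a dlarray.
seq = mdl.Tokenizer.encode("The FTSE100 suffers dramatic losses on the back of the pandemic.");
x = dlarray(seq{1});

% Classify the sentiment and return the class together with its score.
[sentimentClass,sentimentScore] = finbert.sentimentModel(x,mdl.Parameters);
```
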
## Example: Classify Text Data Using BERT
The example [`ClassifyTextDataUsingBERT.m`](./ClassifyTextDataUsingBERT.m) reworks [this existing example](https://www.mathworks.com/help/textanalytics/ug/classify-text-data-using-deep-learning.html) to use BERT as an embedding.
## Example: Summarize Text Using GPT-2
The example [`SummarizeTextUsingTransformersExample.m`](./SummarizeTextUsingTransformersExample.m) shows how to summarize a piece of text using GPT-2.

Transformer networks such as GPT-2 can be used to summarize a piece of text. The trained GPT-2 transformer can generate text given an initial sequence of words as input. The model was trained on comments left on various web pages and internet forums.

Because many of these comments themselves contain a summary indicated by the statement "TL;DR" (Too long, didn't read), you can use the transformer model to generate a summary by appending "TL;DR" to the input text. The `generateSummary` function takes the input text, automatically appends the string `"TL;DR"`, and generates the summary.

### Load Transformer Model
Load the GPT-2 transformer model using the [`gpt2`](./gpt2.m) function.

```matlab:Code
mdl = gpt2;
```

### Load Data
Load the text to summarize. This example summarizes the help text for the `eigs` function.

```matlab:Code
inputText = help('eigs');
```

### Generate Summary
Summarize the text using the [`generateSummary`](./generateSummary.m) function.

```matlab:Code
rng('default')
summary = generateSummary(mdl,inputText)
```

```text:Output
summary =
' EIGS(AFUN,N,FLAG) returns a vector of AFUN's n smallest magnitude eigenvalues'
```
## Example: Classify Sentiment with FinBERT
The example [`SentimentAnalysisWithFinBERT.m`](./SentimentAnalysisWithFinBERT.m) uses the FinBERT sentiment analysis model to classify sentiment for a handful of example financial sentences.
## Example: Masked Token Prediction with BERT and FinBERT
The examples [`LanguageModelingWithBERT.m`](./LanguageModelingWithBERT.m) and [`LanguageModelingWithFinBERT.m`](./LanguageModelingWithFinBERT.m) demonstrate the language models predicting masked tokens.
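
As a rough sketch of the idea only; the exact calls live in those example files, and the `[MASK]` token handling and `bert.languageModel` function used here are assumptions, not confirmed API:

```matlab
% Assumption: the tokenizer maps the literal "[MASK]" string to the mask
% token, and a language-model head is exposed as bert.languageModel.
mdl = bert;
seq = mdl.Tokenizer.encode("The capital of France is [MASK].");
x = seq{1};

% Predict a probability distribution over the vocabulary at every
% position, including the masked one.
probabilities = bert.languageModel(x,mdl.Parameters);
```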
