This repository has been archived by the owner on Oct 31, 2022. It is now read-only.

Flaskapp #90

Open
Wants to merge 90 commits into base: finetuning

90 commits
5b64684
update README
WuTheFWasThat Feb 18, 2019
6dab221
reorganize and add temp 0.7
WuTheFWasThat Feb 19, 2019
aae26ab
add license
WuTheFWasThat Feb 20, 2019
fc0ee6d
add conditional samples
WuTheFWasThat Feb 20, 2019
825aa3d
separate out tensorflow install
WuTheFWasThat Feb 20, 2019
92ce9f2
shuffle headings
WuTheFWasThat Feb 20, 2019
bf43e73
more warning
WuTheFWasThat Feb 20, 2019
23ed990
instructions mention git clone
WuTheFWasThat Feb 20, 2019
99af6d7
Add a Dockerfile and document usage in README
madisonmay Feb 14, 2019
2cf46d9
fixed unconditional sampling reproducibility issue
Feb 20, 2019
946facf
fixed seed arg to ensure reproducibility in conditional-samples model
Feb 20, 2019
b6f943d
update readme
WuTheFWasThat Feb 20, 2019
a3aa7de
add conditional samples with default settings
WuTheFWasThat Feb 21, 2019
68bf7a0
add .gitattributes file to ensure files copied to docker container ha…
Feb 21, 2019
c5b9c89
Minor: update readme
natemurthy Feb 21, 2019
c314dda
Minor: update readme
natemurthy Feb 27, 2019
ed49f03
Add documentation for help flags (#81)
ArmaanBhullar Feb 27, 2019
9d1e704
slight fix to batch size description
WuTheFWasThat Feb 27, 2019
0465394
updates
WuTheFWasThat Feb 28, 2019
d1fc873
Add finetuning code.
Mar 3, 2019
1fba31f
chmod +x
Mar 3, 2019
dfca3cf
Add finetuning instructions
Mar 3, 2019
9423776
Fix sample generation with batch_size greater than 1.
Mar 3, 2019
8eb6793
Python download script (#89)
webproduktion01 Mar 4, 2019
ed0dedc
update download stuff
WuTheFWasThat Mar 4, 2019
953530f
update readme with usage caveats and calls for research
WuTheFWasThat Mar 6, 2019
79a246a
add contributors md and move dev docs out
WuTheFWasThat Mar 6, 2019
8637828
fix for windows (thanks to chrothenbach)
WuTheFWasThat Mar 7, 2019
3e18729
Add training script with Horovod support
tlkh Mar 18, 2019
ec16bad
Fix typo in train command in README
tlkh Mar 18, 2019
0bad9e4
Added instructions for training using Horovod
tlkh Mar 18, 2019
d14501a
Update CONTRIBUTORS.md
WuTheFWasThat Mar 18, 2019
ef62678
Merge pull request #2 from tlkh/finetuning
nshepperd Mar 19, 2019
c465071
autoformat
Mar 4, 2019
1e32b10
Combine input text files with <|endoftext|> delimiter to ensure there…
Mar 19, 2019
3a3ce65
Write losses to summary file for tensorboard.
Mar 20, 2019
d5b387b
Add learning rate as command line flag.
Mar 20, 2019
b106d0a
Use argparse instead of fire in train.py.
Mar 20, 2019
2044d13
Fix encode.py
Mar 21, 2019
a359a34
Add gradient accumulation with default of 5 minibatches
Mar 21, 2019
8738950
Merge remote-tracking branch 'origin/master' into finetuning
Mar 25, 2019
eda8777
Turn off gradient accumulation by default, it shouldn't be needed.
May 2, 2019
0503b1b
updates for 345M model
WuTheFWasThat May 3, 2019
b5ef71a
reference dataset
WuTheFWasThat May 3, 2019
dd75299
remove samples
WuTheFWasThat May 3, 2019
47df6da
Add gradient checkpointing and another optimization necessary to allo…
May 4, 2019
c46ed99
Add "validation" loss calculation.
May 4, 2019
941a762
Add toposort to requirements
Tenoke May 5, 2019
13c5412
Merge pull request #3 from Tenoke/finetuning
May 6, 2019
3985cc7
Add option to use SGD for optimizer
May 14, 2019
7fc2a44
Record learning rate in tensorboard logs
May 14, 2019
a464925
Add text in README for --optimizer flag
May 14, 2019
ae535b6
Reduce default learning rate of train.py.
May 14, 2019
2d4fd0c
Merge remote-tracking branch 'origin/master' into finetuning
May 14, 2019
6a77a7b
New feature: add noise to network inputs to regularize against overre…
May 15, 2019
87fe3d7
Add top-p sampling
May 15, 2019
e99ee37
Add top_p to interactive_conditional_samples.py and generate_uncondit…
May 15, 2019
2b24145
fix typo in top_p
May 15, 2019
6c1f21d
Fix top_p sampling for batch_size>1
May 15, 2019
e5c5054
allow models to be in a separate folder via models_dir argument (#129)
memo May 16, 2019
c0859d7
Fix TODO in sample.sample_sequences- Avoid 'leaving last token calcul…
albertwujj May 31, 2019
41a6793
Update README.md
christopherhesse Jul 27, 2019
e937879
Merge pull request #161 from openai/christopherhesse-patch-1
christopherhesse Jul 27, 2019
cca7144
Updated README.md
biranchi2018 Aug 15, 2019
cb41537
add model card
jackclarksf Aug 20, 2019
f35fa1d
push 774M model
WuTheFWasThat Aug 20, 2019
ac5d522
nucleus sampling
WuTheFWasThat Aug 27, 2019
a070f38
Merge pull request #22 from biranchi2018/biranchi2018-patch-1
Aug 27, 2019
50fa3b6
Add note to install cudnn, re https://github.com/nshepperd/gpt-2/issu…
Jun 16, 2019
b7cda3f
Add flag to set encoding for text reading and writing, defaulting to …
Jul 20, 2019
fbae7db
update readmes
WuTheFWasThat Nov 5, 2019
d98291d
update model card
jackclarksf Nov 5, 2019
ebdba20
updated g_form contact
jackclarksf Nov 26, 2019
0f97760
Update LICENSE
cookee12 Jan 3, 2020
03fce0a
Update README.md
WuTheFWasThat Jan 3, 2020
0574c57
delete
WuTheFWasThat Jan 4, 2020
a74da5d
move to azure
WuTheFWasThat Dec 2, 2020
fdd5ecf
Merge branch 'master' into finetuning
Mar 2, 2021
9741323
Fix models_dir issue #76
Mar 6, 2021
4556dd2
Delete train-horovod.py, which is unmaintained
Mar 6, 2021
2de5d1b
Fixes to support tensorflow v2.x. Training should still work in v1.x.
Mar 6, 2021
ffc54c7
Add tensor rematerialization.
Mar 16, 2021
29ce412
Update twremat.cabal for ghc 9.0
Apr 1, 2021
c002e8f
first commit
napalmj Apr 13, 2022
9c15f97
repositioned items
napalmj Apr 13, 2022
4a2a362
added flask app
napalmj Apr 13, 2022
b05775f
removed env
napalmj Apr 13, 2022
d17db8e
git ignore
napalmj Apr 13, 2022
1f4a69d
deleted text files
napalmj Apr 13, 2022
9eb1d27
ignoring training text
napalmj Apr 13, 2022
Files changed
2 changes: 2 additions & 0 deletions .flaskenv
@@ -0,0 +1,2 @@
+FLASK_APP=flaskapp
+FLASK_ENV=development
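When python-dotenv is installed, `flask run` reads these variables automatically; a sketch of the equivalent manual setup (illustrative only, not part of the PR):

```python
# Equivalent to what `flask run` does with .flaskenv when python-dotenv
# is installed; values taken from the file above.
import os

os.environ["FLASK_APP"] = "flaskapp"      # package exposing create_app()
os.environ["FLASK_ENV"] = "development"   # enables debug mode in Flask 2.1
```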
3 changes: 1 addition & 2 deletions .gitignore
@@ -1,2 +1 @@
-__pycache__
-models/
+.env
48 changes: 0 additions & 48 deletions README.md

This file was deleted.

17 changes: 0 additions & 17 deletions download_model.sh

This file was deleted.

1 change: 1 addition & 0 deletions flaskapp/.gitignore
@@ -0,0 +1 @@
+models/
10 changes: 10 additions & 0 deletions flaskapp/__init__.py
@@ -0,0 +1,10 @@
+from flask import Flask
+from .routes import generator
+
+def create_app(config_file="settings.py"):
+    app = Flask(__name__, static_url_path="/tmp", static_folder="tmp")
+
+    app.config.from_pyfile(config_file)
+
+    app.register_blueprint(generator)
+    return app
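For context, a minimal sketch of driving this application factory outside the Flask CLI; the `wsgi.py` filename and the environment values are assumptions, not part of the PR:

```python
# wsgi.py -- hypothetical entry point (not included in this PR)
import os

from flaskapp import create_app

# settings.py reads these from the environment; illustrative values only.
os.environ.setdefault("ADMIN_USERNAME", "admin")
os.environ.setdefault("ADMIN_PASSWORD", "change-me")

app = create_app()

if __name__ == "__main__":
    app.run(debug=True)
```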
Binary file added flaskapp/__pycache__/__init__.cpython-38.pyc
Binary file not shown.
Binary file added flaskapp/__pycache__/encoder.cpython-38.pyc
Binary file not shown.
Binary file added flaskapp/__pycache__/generator.cpython-38.pyc
Binary file not shown.
Binary file added flaskapp/__pycache__/model.cpython-38.pyc
Binary file not shown.
Binary file added flaskapp/__pycache__/routes.cpython-38.pyc
Binary file not shown.
Binary file added flaskapp/__pycache__/sample.cpython-38.pyc
Binary file not shown.
5 changes: 3 additions & 2 deletions src/encoder.py → flaskapp/encoder.py
@@ -106,9 +106,10 @@ def decode(self, tokens):
         return text

 def get_encoder(model_name):
-    with open(os.path.join('models', model_name, 'encoder.json'), 'r') as f:
+    currentPath = os.path.dirname(__file__) + "/models" + "/" + model_name
+    with open(currentPath + '/encoder.json', 'r') as f:
         encoder = json.load(f)
-    with open(os.path.join('models', model_name, 'vocab.bpe'), 'r', encoding="utf-8") as f:
+    with open(currentPath + '/vocab.bpe', 'r', encoding="utf-8") as f:
         bpe_data = f.read()
     bpe_merges = [tuple(merge_str.split()) for merge_str in bpe_data.split('\n')[1:-1]]
     return Encoder(
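The rewritten loader resolves model files relative to the package directory instead of the process working directory, which is what lets the Flask app find its checkpoint regardless of where the server is launched. A sketch of the same idea in the more portable `os.path.join` style (the helper name is hypothetical):

```python
import os

def model_dir(model_name):
    # Same result as os.path.dirname(__file__) + "/models/" + model_name,
    # but with path separators handled portably.
    return os.path.join(os.path.dirname(__file__), "models", model_name)
```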
67 changes: 67 additions & 0 deletions flaskapp/generator.py
@@ -0,0 +1,67 @@
+#!/usr/bin/env python3
+
+import fire
+import json
+import os
+import numpy as np
+import tensorflow.compat.v1 as tf
+
+from flaskapp import model
+from flaskapp import sample
+from flaskapp import encoder
+
+class AI:
+    def generate_text(self, text_input, model_name="124M_alice", length=100):
+        seed=None
+        nsamples=1
+        batch_size=1
+        temperature=1
+        top_k=40
+        top_p=1
+
+        self.response = ""
+
+        currentPath = os.path.dirname(__file__) + "/models" + "/" + model_name
+
+        if batch_size is None:
+            batch_size = 1
+        assert nsamples % batch_size == 0
+
+        enc = encoder.get_encoder(model_name)
+        hparams = model.default_hparams()
+        with open(currentPath + '/hparams.json') as f:
+            hparams.override_from_dict(json.load(f))
+
+        if length is None:
+            length = hparams.n_ctx // 2
+        elif length > hparams.n_ctx:
+            raise ValueError("Can't get samples longer than window size: %s" % hparams.n_ctx)
+
+        with tf.Session(graph=tf.Graph()) as sess:
+            context = tf.placeholder(tf.int32, [batch_size, None])
+            np.random.seed(seed)
+            tf.set_random_seed(seed)
+            output = sample.sample_sequence(
+                hparams=hparams, length=length,
+                context=context,
+                batch_size=batch_size,
+                temperature=temperature, top_k=top_k, top_p=top_p
+            )
+
+            saver = tf.train.Saver()
+            ckpt = tf.train.latest_checkpoint(currentPath)
+            saver.restore(sess, ckpt)
+
+            context_tokens = enc.encode(text_input)
+            generated = 0
+            for _ in range(nsamples // batch_size):
+                out = sess.run(output, feed_dict={
+                    context: [context_tokens for _ in range(batch_size)]
+                })[:, len(context_tokens):]
+                for i in range(batch_size):
+                    generated += 1
+                    text = enc.decode(out[i])
+                    self.response = text
+        return self.response
+
+ai = AI()
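A minimal smoke test for this module-level singleton might look like the following; note the default model name `124M_alice` refers to a fine-tuned checkpoint that must already exist under `flaskapp/models/`, since the PR does not ship one:

```python
# Hypothetical smoke test; assumes a checkpoint at flaskapp/models/124M_alice.
from flaskapp.generator import ai

completion = ai.generate_text("Alice was beginning to get very tired", length=50)
print(completion)
```

Each call builds a fresh TensorFlow graph and reloads the checkpoint from disk, so every web request pays the full model-load cost; caching the session would be the obvious follow-up.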
25 changes: 18 additions & 7 deletions src/model.py → flaskapp/model.py
@@ -1,6 +1,15 @@
 import numpy as np
-import tensorflow as tf
-from tensorflow.contrib.training import HParams
+import tensorflow.compat.v1 as tf
+
+class HParams(object):
+    def __init__(self, **kwargs):
+        for (k, v) in kwargs.items():
+            setattr(self, k, v)
+
+    def override_from_dict(self, kwargs):
+        for (k, v) in kwargs.items():
+            setattr(self, k, v)
+

 def default_hparams():
     return HParams(
@@ -28,7 +37,7 @@ def gelu(x):
 def norm(x, scope, *, axis=-1, epsilon=1e-5):
     """Normalize to mean = 0, std = 1, then do a diagonal affine transform."""
     with tf.variable_scope(scope):
-        n_state = x.shape[-1].value
+        n_state = shape_list(x)[-1]
         g = tf.get_variable('g', [n_state], initializer=tf.constant_initializer(1))
         b = tf.get_variable('b', [n_state], initializer=tf.constant_initializer(0))
         u = tf.reduce_mean(x, axis=axis, keepdims=True)
@@ -91,7 +100,7 @@ def mask_attn_weights(w):
     def multihead_attn(q, k, v):
         # q, k, v have shape [batch, heads, sequence, features]
         w = tf.matmul(q, k, transpose_b=True)
-        w = w * tf.rsqrt(tf.cast(v.shape[-1].value, w.dtype))
+        w = w * tf.rsqrt(tf.cast(shape_list(v)[-1], w.dtype))

         w = mask_attn_weights(w)
         w = softmax(w)
@@ -114,15 +123,15 @@ def multihead_attn(q, k, v):

 def mlp(x, scope, n_state, *, hparams):
     with tf.variable_scope(scope):
-        nx = x.shape[-1].value
+        nx = shape_list(x)[-1]
         h = gelu(conv1d(x, 'c_fc', n_state))
         h2 = conv1d(h, 'c_proj', nx)
         return h2


 def block(x, scope, *, past, hparams):
     with tf.variable_scope(scope):
-        nx = x.shape[-1].value
+        nx = shape_list(x)[-1]
         a, present = attn(norm(x, 'ln_1'), 'attn', nx, past=past, hparams=hparams)
         x = x + a
         m = mlp(norm(x, 'ln_2'), 'mlp', nx*4, hparams=hparams)
@@ -144,7 +153,7 @@ def positions_for(tokens, past_length):
     return expand_tile(past_length + tf.range(nsteps), batch_size)


-def model(hparams, X, past=None, scope='model', reuse=False):
+def model(hparams, X, past=None, scope='model', reuse=tf.AUTO_REUSE):
     with tf.variable_scope(scope, reuse=reuse):
         results = {}
         batch, sequence = shape_list(X)
@@ -162,6 +171,8 @@ def model(hparams, X, past=None, scope='model', reuse=False):
         assert len(pasts) == hparams.n_layer
         for layer, past in enumerate(pasts):
             h, present = block(h, 'h%d' % layer, past=past, hparams=hparams)
+            if layer == 10:
+                tf.add_to_collection('checkpoints', h)
             presents.append(present)
         results['present'] = tf.stack(presents, axis=1)
         h = norm(h, 'ln_f')
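The `x.shape[-1].value` accesses break under TensorFlow 2's compat layer, where shape entries are plain ints rather than `Dimension` objects, hence the switch to `shape_list`. For reference, the helper as defined earlier in the upstream `src/model.py` (unchanged by this diff):

```python
def shape_list(x):
    """Deal with dynamic shape in tensorflow cleanly."""
    static = x.shape.as_list()
    dynamic = tf.shape(x)
    return [dynamic[i] if s is None else s for i, s in enumerate(static)]
```

The `layer == 10` hook appears to tag an activation in the `'checkpoints'` collection for the fork's gradient-checkpointed training path; sampling simply ignores it.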
3 changes: 3 additions & 0 deletions flaskapp/requirements.txt
@@ -0,0 +1,3 @@
+# Python 3.8.10 (interpreter version; "Python" is not a pip-installable package)
+Flask==2.1.1
+Werkzeug==2.0.3
18 changes: 18 additions & 0 deletions flaskapp/routes.py
@@ -0,0 +1,18 @@
+from inspect import Parameter
+from django.shortcuts import render
+from flask import Blueprint, render_template, request, redirect
+from .generator import ai
+
+generator = Blueprint('generator', __name__)
+
+@generator.route('/')
+def index():
+    # parameter = request.form['parameter']
+    return render_template('index.html')
+
+@generator.route('/analyze', methods=['POST'])
+def analyze():
+    title = request.form['title']
+    text = ai.generate_text(title)
+
+    return render_template('index.html', text=text)
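Note that the `inspect.Parameter` and `django.shortcuts` imports are unused, and the Django one will raise ImportError unless Django happens to be installed; the blueprint itself only needs Flask. A sketch of exercising these routes with Flask's built-in test client (hypothetical, and `/analyze` only succeeds with a model checkpoint in place since it invokes the generator):

```python
# Hypothetical test of the two routes registered above.
from flaskapp import create_app

client = create_app().test_client()

assert client.get("/").status_code == 200
resp = client.post("/analyze", data={"title": "Once upon a time"})
print(resp.data.decode()[:200])
```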
93 changes: 93 additions & 0 deletions flaskapp/sample.py
@@ -0,0 +1,93 @@
+import tensorflow.compat.v1 as tf
+
+from flaskapp import model
+
+def top_k_logits(logits, k):
+    if k == 0:
+        # no truncation
+        return logits
+
+    def _top_k():
+        values, _ = tf.nn.top_k(logits, k=k)
+        min_values = values[:, -1, tf.newaxis]
+        return tf.where(
+            logits < min_values,
+            tf.ones_like(logits, dtype=logits.dtype) * -1e10,
+            logits,
+        )
+    return tf.cond(
+        tf.equal(k, 0),
+        lambda: logits,
+        lambda: _top_k(),
+    )
+
+
+def top_p_logits(logits, p):
+    with tf.variable_scope('top_p_logits'):
+        logits_sort = tf.sort(logits, direction='DESCENDING')
+        probs_sort = tf.nn.softmax(logits_sort)
+        probs_sums = tf.cumsum(probs_sort, axis=1, exclusive=True)
+        logits_masked = tf.where(probs_sums < p, logits_sort, tf.ones_like(logits_sort)*1000) # [batchsize, vocab]
+        min_logits = tf.reduce_min(logits_masked, axis=1, keepdims=True) # [batchsize, 1]
+        return tf.where(
+            logits < min_logits,
+            tf.ones_like(logits, dtype=logits.dtype) * -1e10,
+            logits,
+        )
+
+
+def sample_sequence(*, hparams, length, start_token=None, batch_size=None, context=None, temperature=1, top_k=0, top_p=0.0):
+    if start_token is None:
+        assert context is not None, 'Specify exactly one of start_token and context!'
+    else:
+        assert context is None, 'Specify exactly one of start_token and context!'
+        context = tf.fill([batch_size, 1], start_token)
+
+    def step(hparams, tokens, past=None):
+        lm_output = model.model(hparams=hparams, X=tokens, past=past, reuse=tf.AUTO_REUSE)
+
+        logits = lm_output['logits'][:, :, :hparams.n_vocab]
+        presents = lm_output['present']
+        presents.set_shape(model.past_shape(hparams=hparams, batch_size=batch_size))
+        return {
+            'logits': logits,
+            'presents': presents,
+        }
+
+    with tf.name_scope('sample_sequence'):
+        def body(past, prev, output):
+            next_outputs = step(hparams, prev, past=past)
+            logits = next_outputs['logits'][:, -1, :] / tf.to_float(temperature)
+            if top_p > 0.0:
+                logits = top_p_logits(logits, p=top_p)
+            else:
+                logits = top_k_logits(logits, k=top_k)
+            samples = tf.multinomial(logits, num_samples=1, output_dtype=tf.int32)
+            return [
+                next_outputs['presents'] if past is None else tf.concat([past, next_outputs['presents']], axis=-2),
+                samples,
+                tf.concat([output, samples], axis=1)
+            ]
+
+        past, prev, output = body(None, context, context)
+
+        def cond(*args):
+            return True
+
+        _, _, tokens = tf.while_loop(
+            cond=cond, body=body,
+            maximum_iterations=length - 1,
+            loop_vars=[
+                past,
+                prev,
+                output
+            ],
+            shape_invariants=[
+                tf.TensorShape(model.past_shape(hparams=hparams, batch_size=batch_size)),
+                tf.TensorShape([batch_size, None]),
+                tf.TensorShape([batch_size, None]),
+            ],
+            back_prop=False,
+        )
+
+        return tokens
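For intuition, `top_p_logits` keeps the smallest set of highest-probability tokens whose cumulative mass reaches `p` and pushes every other logit to -1e10 before sampling. A NumPy sketch of the same rule for a single row of logits (illustrative only, not part of the PR):

```python
import numpy as np

def top_p_filter(logits, p):
    order = np.argsort(logits)[::-1]        # sort descending
    probs = np.exp(logits[order] - logits[order].max())
    probs /= probs.sum()                    # softmax over sorted logits
    cum = np.cumsum(probs) - probs          # exclusive cumsum, as in tf.cumsum(..., exclusive=True)
    keep = order[cum < p]                   # always keeps at least the top token
    out = np.full(logits.shape, -1e10)
    out[keep] = logits[keep]
    return out

# Keeps the top three tokens here; the last is masked out.
print(top_p_filter(np.array([2.0, 1.0, 0.5, -1.0]), p=0.9))
```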
4 changes: 4 additions & 0 deletions flaskapp/settings.py
@@ -0,0 +1,4 @@
+import os
+
+ADMIN_USERNAME=os.environ.get('ADMIN_USERNAME')
+ADMIN_PASSWORD=os.environ.get('ADMIN_PASSWORD')
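Flask's `from_pyfile` imports only uppercase names, so these two values land in `app.config`; nothing else in the diff reads them yet, presumably groundwork for a later feature. A quick check (hypothetical):

```python
from flaskapp import create_app

app = create_app()
# None unless ADMIN_USERNAME is set in the environment before create_app().
print(app.config.get("ADMIN_USERNAME"))
```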
23 changes: 23 additions & 0 deletions flaskapp/templates/index.html
@@ -0,0 +1,23 @@
+<!DOCTYPE html>
+
+<html>
+<head>
+    <style>
+
+    </style>
+</head>
+<body>
+    <div>
+        <form method="POST" action="{{ url_for('generator.analyze') }}">
+            <div>
+                <label>Title</label>
+                <input type="text" placeholder="Enter prompt" name="title"/>
+            </div>
+            <button type="submit">Submit</button>
+        </form>
+        <p>
+            {{ text }}
+        </p>
+    </div>
+</body>
+</html>
6 changes: 6 additions & 0 deletions gpt-2/.gitattributes
@@ -0,0 +1,6 @@
+# convert to OS line endings on checkout, back to LF on commit
+* text=auto
+
+# ensure anything copied to the container has unix style line endings
+*.sh text eol=lf
+requirements.txt text eol=lf