Add columbia ordering to management command #4325
…e combined opinions
…ring, update log messages, update typing
This is the list of IDs of the Columbia clusters that have more than one opinion and have columbia in their source: _SELECT_search_opinioncluster_id_FROM_search_opinioncluster_LEFT_202408260949.csv. This is the query I used to get the list:
cl/corpus_importer/management/commands/update_opinions_order.py
update docstrings
…columbia-ordering
refactor the ngram generator change function name for getting cleaned columbia text
Thanks @quevon24, this looks good to me.
This PR adds functionality to the update_opinions_order command to fill the ordering_key field for Columbia opinions.
It searches for clusters that include columbia in their source (Z, ZL, ZU, ZLU, etc.), that have more than one opinion, and whose opinions have a file assigned in the local_path field.
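As a rough illustration of those selection criteria, here is a minimal plain-Python sketch. It is not the command's actual query: the `Cluster` and `Opinion` dataclasses are hypothetical stand-ins for the Django models, and it assumes that every opinion in the cluster must have a non-empty local_path and that "Z" in the source string marks a Columbia source.

```python
from dataclasses import dataclass, field


@dataclass
class Opinion:
    # Hypothetical stand-in for the opinion model; only local_path matters here.
    local_path: str = ""


@dataclass
class Cluster:
    # Hypothetical stand-in for the cluster model.
    id: int
    source: str
    opinions: list = field(default_factory=list)


def is_columbia_candidate(cluster: Cluster) -> bool:
    """Return True if the cluster matches the three criteria described above."""
    # Assumption: "Z" in the source string marks Columbia (Z, ZL, ZU, ZLU, ...).
    has_columbia_source = "Z" in cluster.source
    has_multiple_opinions = len(cluster.opinions) > 1
    # Assumption: every opinion needs an XML file assigned in local_path.
    all_have_files = all(op.local_path for op in cluster.opinions)
    return has_columbia_source and has_multiple_opinions and all_have_files
```

For example, a cluster with source "ZLU" and two opinions that both have a local_path would qualify, while one with a single opinion would not.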
The command to run it:
docker exec -it cl-django python /opt/courtlistener/manage.py update_opinions_order --action sort-columbia --xml-dir /opt/courtlistener/cl/assets/columbia
We need to place the XML files in a directory and pass its path to the command, which lets it read the XML file assigned to each opinion. This is essential: we rely on the original XML to infer the correct order, because the old Columbia importer created some opinions in the wrong order.
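The idea of inferring order from the original XML can be sketched as follows. This is only an illustration under stated assumptions, not the command's implementation: the tag names in `OPINION_TAGS`, the flat `<case>` layout, and the exact-match comparison of whitespace-normalized text are all hypothetical, and the real command may match opinions to XML sections differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag names for the opinion sections of a Columbia XML file.
OPINION_TAGS = {"opinion", "concurrence", "dissent"}


def normalize(text: str) -> str:
    """Collapse runs of whitespace so text can be compared reliably."""
    return " ".join(text.split())


def xml_sections(xml_text: str) -> list:
    """Return the normalized text of each opinion section, in document order."""
    root = ET.fromstring(xml_text)
    # Assumption: opinion sections are direct children of the root element.
    return [
        normalize("".join(child.itertext()))
        for child in root
        if child.tag in OPINION_TAGS
    ]


def infer_ordering_keys(xml_text: str, opinion_texts: dict) -> dict:
    """Map each opinion id to its 1-based position in the XML file."""
    keys = {}
    for position, section in enumerate(xml_sections(xml_text), start=1):
        for opinion_id, text in opinion_texts.items():
            # Assumption: an opinion matches a section when the texts are equal.
            if opinion_id not in keys and normalize(text) == section:
                keys[opinion_id] = position
                break
    return keys
```

Given a file whose majority section precedes the dissent, the opinion holding the majority text receives key 1 and the dissent key 2, regardless of the order in which the two opinions were created in the database.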
To test it, you can clone these clusters:
docker exec -it cl-django python /opt/courtlistener/manage.py clone_from_cl --type search.OpinionCluster --id 9079419 4226317 8279828 8237637 8041148 4233235 4223647 8041161 8040857 8040920
Then place the files from test_xml.zip in your local environment at this location: /opt/courtlistener/cl/assets/columbia
Finally, run the command exactly as shown above.