Eyecite Fails to Parse Complex Citations Correctly #185

Open
flooie opened this issue Oct 8, 2024 · 0 comments


flooie commented Oct 8, 2024

Recently, @anseljh highlighted missing citations in CourtListener (CL), and while investigating, I encountered some challenging parsing issues—likely edge cases.

For example, consider the following from Jasmine v. Superior Court:

“This is a pure question of law, which we address without deference to the trial court’s ruling. (See In re K.F. (2009) 173 Cal.App.4th 655, 661 [92 Cal.Rptr.3d 784]; Yield Dynamics, Inc. v. TEA Systems Corp. (2007) 154 Cal.App.4th 547, 558 [66 Cal.Rptr.3d 1].)”

Problem:

Eyecite struggles to correctly parse this structure. There are two cases here, each with two citations: a primary citation and a bracketed parallel citation.

In re K.F. Citations:

1.	In re K.F. (2009) 173 Cal.App.4th 655, 661
2.	[92 Cal.Rptr.3d 784]

Yield Dynamics Citations:

1.	Yield Dynamics, Inc. v. TEA Systems Corp. (2007) 154 Cal.App.4th 547, 558
2.	[66 Cal.Rptr.3d 1]

Results from get_citations:

When parsing the string, Eyecite produces the following four citations:

1.	FullCaseCitation('173 Cal.App.4th 655', ...)
2.	FullCaseCitation('92 Cal.Rptr.3d 784', ...)
3.	FullCaseCitation('154 Cal.App.4th 547', ...)
4.	FullCaseCitation('66 Cal.Rptr.3d 1', ...)
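
A minimal sketch for reproducing this, assuming eyecite is installed. `summarize` is a hypothetical helper, not part of eyecite; it reads the `year`, `plaintiff`, `defendant`, and `extra` metadata fields discussed in this issue with `getattr`, so a missing field shows up as `None` instead of raising. No output is shown because it may differ between versions.

```python
try:
    from eyecite import get_citations
except ImportError:  # eyecite not installed; summarize() below still works
    get_citations = None

TEXT = (
    "This is a pure question of law, which we address without deference to "
    "the trial court's ruling. (See In re K.F. (2009) 173 Cal.App.4th 655, 661 "
    "[92 Cal.Rptr.3d 784]; Yield Dynamics, Inc. v. TEA Systems Corp. (2007) "
    "154 Cal.App.4th 547, 558 [66 Cal.Rptr.3d 1].)"
)

# Metadata fields this issue is concerned with.
FIELDS = ("year", "plaintiff", "defendant", "extra")


def summarize(citation):
    """Hypothetical helper: collect the metadata fields of one citation."""
    meta = getattr(citation, "metadata", None)
    return {name: getattr(meta, name, None) for name in FIELDS}


if __name__ == "__main__" and get_citations is not None:
    for cite in get_citations(TEXT):
        # matched_text() is the raw text span the citation was found in.
        print(cite.matched_text(), summarize(cite))
```

Comparing the printed `year` for the two K.F. citations against the two Yield Dynamics citations is the quickest way to see whether the dates are being crossed.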

Each of these is correct as to the citation itself, but parsing breaks down when Eyecite tries to attach metadata such as the date. This is what throws a wrench into our citation annotator.

  1. Date Issues:
    • In re K.F. is incorrectly assigned the year of Yield Dynamics (2007) rather than its own (2009). This happens because Eyecite is not separating the citations appropriately.

  2. Plaintiff/Defendant Parsing:
    • In cases like In re K.F., where there is no explicit plaintiff or defendant, Eyecite struggles to parse the parties correctly. For example, “K.F.” is being treated as a defendant.
    • Similarly, in Yield Dynamics, “Inc.” is assigned as the plaintiff, when it is actually part of the full title of the case.

  3. Extra Data Repairing:
    • The “extra” field is being populated with text that belongs to the citation that follows.
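
To illustrate the boundary problem behind the date mix-up, here is a rough standalone sketch (standard library only, not how eyecite works internally). It splits the text on semicolons and only lets a citation pick up a parenthesized year from its own segment. The function names are hypothetical, and the approach assumes semicolons only ever separate cases, which will not hold for semicolons inside parentheticals.

```python
import re

SAMPLE = (
    "(See In re K.F. (2009) 173 Cal.App.4th 655, 661 [92 Cal.Rptr.3d 784]; "
    "Yield Dynamics, Inc. v. TEA Systems Corp. (2007) 154 Cal.App.4th 547, 558 "
    "[66 Cal.Rptr.3d 1].)"
)

YEAR = re.compile(r"\((\d{4})\)")


def split_on_semicolons(text):
    """Return (start, end) offsets for each semicolon-separated segment."""
    spans, start = [], 0
    for match in re.finditer(";", text):
        spans.append((start, match.start()))
        start = match.end()
    spans.append((start, len(text)))
    return spans


def year_for_offset(text, offset):
    """Return the first parenthesized year in the segment containing offset."""
    for start, end in split_on_semicolons(text):
        if start <= offset < end:
            found = YEAR.search(text, start, end)
            return found.group(1) if found else None
    return None


if __name__ == "__main__":
    for cite in ("173 Cal.App.4th 655", "92 Cal.Rptr.3d 784",
                 "154 Cal.App.4th 547", "66 Cal.Rptr.3d 1"):
        print(cite, year_for_offset(SAMPLE, SAMPLE.index(cite)))
```

With the segments kept apart, both K.F. citations can only see 2009 and both Yield Dynamics citations can only see 2007.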

I think we need to:

  1. Add more sophisticated citation boundary detection, or at least use semicolons more effectively.
  2. Improve party parsing to handle cases without a plaintiff or defendant, e.g. In re K.F.
  3. Add a new pattern for the (possibly common) TITLE (YEAR) CITATION format that we see in California.
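
As a starting point for the TITLE (YEAR) CITATION idea, a rough regex sketch is below. It is a standalone illustration using only the standard library, not eyecite's tokenizer or an existing eyecite pattern, and `CAL_STYLE` and its group names are hypothetical. It assumes reporters are written without spaces (as in Cal.App.4th), strips a leading "See" signal, and keeps the title whole instead of splitting it into plaintiff and defendant.

```python
import re

CAL_STYLE = re.compile(
    r"""
    (?:See\s+)?                      # optional signal, not part of the title
    (?P<title>[A-Z][^();]*?)\s+      # case title, kept as a single string
    \((?P<year>\d{4})\)\s+           # (YEAR) sits between title and citation
    (?P<volume>\d+)\s+
    (?P<reporter>[A-Z][A-Za-z0-9.]*)\s+
    (?P<page>\d+)
    (?:,\s*(?P<pin>\d+))?            # optional pin cite
    (?:\s*\[(?P<parallel>[^\]]+)\])? # optional bracketed parallel citation
    """,
    re.VERBOSE,
)

SAMPLE = (
    "(See In re K.F. (2009) 173 Cal.App.4th 655, 661 [92 Cal.Rptr.3d 784]; "
    "Yield Dynamics, Inc. v. TEA Systems Corp. (2007) 154 Cal.App.4th 547, 558 "
    "[66 Cal.Rptr.3d 1].)"
)


def parse_cal_style(text):
    """Return one dict per TITLE (YEAR) CITATION match found in text."""
    return [match.groupdict() for match in CAL_STYLE.finditer(text)]


if __name__ == "__main__":
    for parsed in parse_cal_style(SAMPLE):
        print(parsed)
```

Because the year and the bracketed parallel citation are captured in the same match as the primary citation, both citations for a case share one title and one year, which is the grouping we want for the example above.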