Legal Semantic Analysis
My ultimate goal as a lawyer and law professor is to extract reasoning patterns from legal documents and use them to formulate better arguments in future cases. This requires annotated documents that can be used to develop and test analytics, which in turn can power useful web applications. It is also important that we document both the annotation and the analytics.
Annotation
Annotation begins with a document in plain text. To this text we add semantic annotation, which attaches layers of meaning to specific spans of text. Such annotation makes meaning explicit in a form that is also available for computation.
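As a minimal sketch of what annotating spans of plain text can look like in code, the example below stores each annotation as a pair of character offsets plus a layer and a label, so that several layers can cover the same text. The class name `SpanAnnotation`, the helper functions, and the layer and label names are hypothetical illustrations, not the categories defined by this repo's protocols.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanAnnotation:
    """One stand-off annotation over a plain-text document (hypothetical schema)."""
    start: int   # inclusive character offset into the plain text
    end: int     # exclusive character offset
    layer: str   # illustrative layer name, e.g. "sentence_role"
    label: str   # illustrative label within that layer


def text_of(document: str, ann: SpanAnnotation) -> str:
    """Return the exact span of text that an annotation covers."""
    return document[ann.start:ann.end]


def annotations_at(annotations, offset):
    """Return every annotation, on any layer, that covers a character offset."""
    return [a for a in annotations if a.start <= offset < a.end]


# The plain text is never modified; annotations only point into it.
document = "The court finds that the evidence supports the claim."
phrase = "the evidence supports the claim"
start = document.index(phrase)

annotations = [
    # Illustrative labels only, not the repo's published categories.
    SpanAnnotation(0, len(document), "sentence_role", "finding"),
    SpanAnnotation(start, start + len(phrase), "clause", "proposition"),
]

for ann in annotations_at(annotations, start):
    print(ann.layer, ann.label, repr(text_of(document, ann)))
```

Keeping annotations separate from the text lets new layers be added without disturbing existing ones, which is one common way to represent overlapping layers of meaning.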
For legal documents, the meaning that lawyers understand is legal in nature. Words and phrases may have different meanings than their non-legal uses, and statements made with sentences or clauses might have distinctive implications in law. Moreover, the same words, phrases, clauses, sentences, and sets of sentences may have varying significance or roles, depending upon the legal context. We capture these many layers of meaning through “legal semantic analysis.” We define the legal semantic categories by publishing protocols for correct annotation.
Lawyers learn how to interpret the legal significance of a document through law school and experience. When they read a court decision, for example, they appreciate layers of meaning that would be unknown to many non-legal readers. Moreover, lawyers can read the same document differently, depending upon their purpose or goal. A lawyer looking for arguments to make in a new case reads court decisions in a way geared toward identifying successful and unsuccessful arguments in those decided cases.
AI Analytics
How can artificial intelligence (AI) help such a lawyer? AI analytics can:
- locate and retrieve similar cases that addressed the same legal issues with similar types of evidence;
- automatically annotate those retrieved decisions, drawing the lawyer’s attention to those groups of sentences that show the relevant reasoning; and
- recommend to the lawyer ways to develop new arguments for the lawyer’s new case, based on those reported cases.

It is also important to publish and document our AI analytics.
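To make the retrieval step concrete, here is a deliberately simple sketch that ranks decided cases by word overlap with a description of the new case. It is a lexical baseline using cosine similarity over word counts, not the method used in this repo; the function names, case identifiers, and case texts are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_cases(query, cases):
    """Return (case_id, score) pairs, most similar to the query first."""
    q = Counter(tokenize(query))
    scored = [(cid, cosine(q, Counter(tokenize(text))))
              for cid, text in cases.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented example texts; real decisions would be far longer.
cases = {
    "case_a": "medical records and physician testimony established the injury",
    "case_b": "the contract was breached when the delivery was late",
}

ranking = rank_cases("physician testimony about the injury", cases)
for case_id, score in ranking:
    print(case_id, round(score, 3))
```

A word-count baseline like this only captures surface vocabulary. Matching on legal issues and types of evidence, as described above, would depend on the semantic annotation layers rather than on raw word overlap.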
Such AI analytics are under development, incorporating an ever-expanding suite of tools, such as deep learning and word embeddings. Such technologies are powerful, but also under-utilized. The biggest barrier today is not our technology, but our understanding of what we are looking for: the layers of legal meaning that lawyers “see” when they read legal documents. What we need is a semantic analysis that is geared toward AI legal applications. This GitHub repo is dedicated to developing and documenting such a legal semantic analysis.