Big data learning curve

I started teaching myself to code a few months back, and I’ve given a lot of thought to what I want to achieve as a lawyer by doing so. Yes, I am learning to make useful blockchain tools, but just as exciting, I’m gaining new pathways for advocacy through data analysis. The competencies I am building are: (1) drafting and problem-solving in the financial blockchain world; and (2) advocacy using big data. So how do I get there?

I started out, against internet advice, trying to teach myself C++. This reminded me of the sixth-grade community Apple II and why I avoided coding in the first place. Granular, low-level programming languages do not yet inspire my mind with their potential; as a beginner, I need more instant gratification. Switching to JavaScript and Python, I started to see tools that could help me as a lawyer. I chose data science as a competency goal because I want to use my advocacy skills to create persuasive narrative from data, so I set my sights on Hadoop: a powerful framework for distributed data storage and analysis. A scroll through a few demonstrations sets my brain tingling with possibility and excitement. This is an effective tool for advocacy, maybe even a fun one. In Hadoop and the ecosystem of products developed around it, I see the potential for legal analysis writ large: I can think like a master using a bigger brain to do the hard stuff. I can imagine relationships and discover patterns and put them into narrative forms that work alone or in tandem with English, like images or code.
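To make that less abstract: the idea at the heart of Hadoop is the map/reduce pattern, where a big job is split into small "map" steps run in parallel and their results are grouped and combined in a "reduce" step. Here is a minimal sketch of that pattern in plain Python, with invented documents standing in for a real corpus; no Hadoop cluster involved, just the shape of the idea.

```python
from collections import defaultdict

# Invented stand-in documents; a real job would read files from storage.
documents = [
    "the contract was delayed by rain",
    "rain and flooding delayed the work",
]

# Map step: emit a (word, 1) pair for every word in every document.
mapped = []
for doc in documents:
    for word in doc.split():
        mapped.append((word, 1))

# Shuffle step: group the emitted counts by word.
grouped = defaultdict(list)
for word, count in mapped:
    grouped[word].append(count)

# Reduce step: combine each word's counts into a total.
word_counts = {word: sum(counts) for word, counts in grouped.items()}

print(word_counts["rain"])  # "rain" appears once in each document, so 2
```

Hadoop's value is that the map and reduce steps above can run on many machines over data far too large for one computer, but the logic a beginner has to write is no more complicated than this.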

Already, coding skills are helping me refine the way I envision law as an interconnected, functioning machine. Imagine, as a lawyer, being able to sift through more information than you could ever read yourself to see patterns and make predictions. How would the technology that allowed Google to predict localized flu epidemics help predict the outcome of a case? What if you had access to the kind of big-picture analytical perspective that exposed both drug trafficking patterns and investigator fraud in the Silk Road case? What new answers will we find to old questions in accessible data?

For example, take the evidentiary idea of proving weather by judicial notice. For centuries, weather was not subject to much evidentiary dispute because there was little available data beyond personal recollection and perhaps an official report. In practice, I found that weather had a profound impact on small business contracts and insurance cases. It came up in litigation frequently, and I found that by gathering weather data and building persuasive narrative around it, I could win or settle cases. I once defended a contractor’s late performance under a construction contract using NOAA measurements of rainfall and river height showing that it was raining, flooding, or both on most days during the performance period. The data tended to excuse my client’s delay, and to refute my narrative the other side would have had to hire an expert witness. The case settled at the mediation in which I presented the data. Weather data is scientific evidence like any other, and using it as a lawyer to thin-slice a narrative out of the facts is data analysis on a small scale.
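The analysis behind that argument is simple enough to sketch in a few lines of Python. The daily observations, the flood stage, and the dates below are all invented for illustration (not real NOAA records); the point is the shape of the question: on what fraction of the performance period was it raining or flooding?

```python
from datetime import date

FLOOD_STAGE_FT = 28.0  # hypothetical flood stage for the river gauge

# Invented sample rows: (date, precipitation in inches, river height in feet)
daily_observations = [
    (date(2015, 4, 1), 0.8, 27.1),
    (date(2015, 4, 2), 1.2, 29.4),
    (date(2015, 4, 3), 0.0, 28.6),
    (date(2015, 4, 4), 0.0, 25.0),
    (date(2015, 4, 5), 0.3, 26.2),
]

# A day counts as "affected" if it saw any rain or a river at flood stage.
affected_days = [
    day for day, precip_in, river_ft in daily_observations
    if precip_in > 0.0 or river_ft >= FLOOD_STAGE_FT
]

fraction = len(affected_days) / len(daily_observations)
print(f"{len(affected_days)} of {len(daily_observations)} days affected "
      f"({fraction:.0%})")
```

With real NOAA daily summaries in place of the sample rows, the same loop over a months-long performance period produces the kind of figure that anchors a settlement narrative: "it was raining or flooding on X percent of the working days."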

Data is growing exponentially. With Hadoop and its ecosystem, we can thin-slice more data with a bigger brain to produce narratives that capture the imagination in multiple dimensions. I’m using Cloudera’s tutorials – the introductory courses are free, and more advanced or in-person training is available for a fee.