2nd International Workshop on Network Analysis in Law


Wednesday, December 10th, 2014

In conjunction with JURIX 2014: the 27th International Conference on Legal Knowledge and Information Systems, Krakow, Poland

Radboud Winkels Nicola Lettieri

Preface

This volume contains the papers presented at NAiL2014: the 2nd International Workshop on Network Analysis in Law, held on December 10, 2014 in Krakow in conjunction with the 27th JURIX conference on Legal Knowledge and Information Systems.

In 2013 we organized the first workshop on Network Analysis in Law at the 14th ICAIL conference in Rome. Eight papers were presented on various aspects of this field, and after the conference extended versions of these eight, plus an extended version of a paper from the main conference, were collected in a volume of the “Law Science Technology” series (ISBN 9788849527698).

This second workshop again aimed to bring together researchers from computational social science, computational legal theory, network science and related disciplines in order to discuss the use and usefulness of network analysis in the legal domain. We invited papers and demonstrations of original works on the following aspects of network analysis in the legal field:

1. Analysis and visualization of networks of people and institutions: law is made by people, about and for people and institutions. These people or institutions form networks, be it academic scholars, criminals or public bodies, and these networks can be detected, mapped, analysed and visualised. Can we better study institutions and their activities by analysing their internal structure or the network of their relations? Does it help in finding the ‘capo di tutti i capi’ in organized crime?

2. Analysis and visualization of the network of law: law itself forms networks. Sources of law refer to other sources of law and together constitute (part of) the core of the legal system. In the same way as above, we can represent, analyse and visualise this network. Can it help in determining the authority of case law or the likelihood that a decision will be overruled? Does it shed light on complex or problematic parts of legislation? Is it possible to exploit network visualization to support legal analysis and information retrieval?

3. Combination of the first and second aspects: people or institutions create sources of law or appear in them. Research on the network of one may shed light on the other. Two examples:
   - Legal scholars write commentaries on proposed legislation or court decisions. Sometimes they write these together. These commentaries may provide information on the network of scholars; the position of an author in the network of scholars may provide information on the authority of the comment.
   - The application of network analysis techniques to court decisions and proceedings is proving to be helpful in detecting criminal organizations and in analysing their structure and evolution over time.

Submissions were subject to a light review process on appropriateness for this call, originality of the research described and technical quality. In the end 13 submissions were accepted and presented, plus one paper that was accepted as a full paper for the main conference. All accepted authors will be asked to submit improved and extended versions of their paper for post-workshop publication after another review round.

December 5, 2014 Amsterdam

Radboud Winkels Nicola Lettieri


Table of Contents

Statutory Network Analysis plus Information Retrieval
   Kevin Ashley, Elizabeth Ferrell Bjerke, Margaret Potter, Hasan Guclu, Jaromir Savelka and Matthias Grabmair

Document-enhancement and network analysis for criminal investigation: an holistic approach
   Nicola Lettieri, Delfina Malandrino and Luca Vicidomini

The Case Law of the Italian Constitutional Court between Network Theory and Philosophy of Information
   Tommaso Agnoloni and Ugo Pagallo

Prominent Actors in Italian Civil Judiciary: a Social Network Analysis study
   Gabriele Rinaldi and Giacomo Fiumara

Reference Structures of National Constitutions
   Bart Karstens, Marijn Koolen, Giuseppe Dari Mattiacci, Rens Bod and Tom Ginsburg

35 years of Multilateral Environmental Agreements Ratification: a Network Analysis
   Romain Boulet, Ana Flavia Barros-Platiau and Pierre Mazzega

Network analysis as an aid to legal interpretation – can counting and drawing rules help lawyers understand the context of those rules?
   John Fitzgerald

Semi-Automatic Construction of Skeleton Concept Maps from Case Judgments
   Alexander Boer and Bas Sijtsma

Outcome networks for policy analysis, with an application to a case study in labor law
   Remo Pareschi, Franco Toffoletto and Paolino Zica

A Method to Compare the Complexity of Legal Acts
   Tõnu Tamme, Leo Võhandu and Ermo Täks

Nets of Legal Information Connecting and Displaying Heterogeneous Legal Sources
   Nicola Lettieri, Sebastiano Faro, Luca Vicidomini and Antonio Altamura

Why do you quote me? Citation of Superior Court orders in the Sicilian courts
   Deborah De Felice, Giuseppe Giura and Vilhelm Verendel

Towards a Legal Recommender System
   Radboud Winkels

A Model of Legal Systems as Evolutionary Networks: Normative Complexity and Self-organization of Clusters of Rules
   Carlo Garbarino

Program Committee

Michael Bommarito, ReInventLaw Laboratory
Daniele Bourcier, CERSA, l’Université de Paris II
Pompeu Casanovas, UAB
Sebastiano Faro, ITTIG CNR
Giacomo Fiumara, University of Messina
Rinke Hoekstra, Vrije Universiteit
Daniel Katz, Michigan State University
Marc Lauritsen, Capstone Practice
Nicola Lettieri, University of Sannio Law School
Innar Liiv, Tallinn University of Technology
Delfina Malandrino, University of Salerno
Romain Boulet, University of Lyon
Marc van Opijnen, KOOP
Radboud Winkels, University of Amsterdam


2nd International Workshop on Network Analysis in Law
10th December 2014, Krakow, Poland

9:15   Welcome and Opening
       Radboud Winkels and Nicola Lettieri
9:35   Statutory Network Analysis plus Information Retrieval
       Kevin Ashley, Elizabeth Ferrell Bjerke, Margaret Potter, Hasan Guclu, Jaromir Savelka and Matthias Grabmair
9:55   Document-enhancement and network analysis for criminal investigation: an holistic approach
       Nicola Lettieri, Delfina Malandrino and Luca Vicidomini
10:15  The Case Law of the Italian Constitutional Court between Network Theory and Philosophy of Information
       Tommaso Agnoloni and Ugo Pagallo
10:35  Prominent Actors in Italian Civil Judiciary: a Social Network Analysis study
       Gabriele Rinaldi and Giacomo Fiumara
10:55  Coffee Break
11:20  Reference Structures of National Constitutions
       Bart Karstens, Marijn Koolen, Giuseppe Dari Mattiacci, Rens Bod and Tom Ginsburg
11:40  35 years of Multilateral Environmental Agreements Ratification: a Network Analysis
       Romain Boulet, Ana Flavia Barros-Platiau and Pierre Mazzega
12:00  Network analysis as an aid to legal interpretation – can counting and drawing rules help lawyers understand the context of those rules?
       John Fitzgerald
12:20  Semi-Automatic Construction of Skeleton Concept Maps from Case Judgments
       Alexander Boer and Bas Sijtsma
12:40  Outcome networks for policy analysis, with an application to a case study in labor law
       Remo Pareschi and Paolino Zica
13:00  Lunch
14:00  A Method to Compare the Complexity of Legal Acts
       Tõnu Tamme, Leo Võhandu and Ermo Täks
14:20  Nets of Legal Information Connecting and Displaying Heterogeneous Legal Sources
       Nicola Lettieri, Sebastiano Faro, Luca Vicidomini and Antonio Altamura
14:40  Why do you quote me? Citation of Superior Court orders in the Sicilian courts
       Deborah De Felice, Giuseppe Giura and Vilhelm Verendel
14:55  Towards a Legal Recommender System
       Radboud Winkels, Alexander Boer, Bart Vredebregt and Alexander van Someren
15:10  A Model of Legal Systems as Evolutionary Networks: Normative Complexity and Self-organization of Clusters of Rules
       Carlo Garbarino
15:30  Closing

Statutory Network Analysis plus Information Retrieval

Kevin D. Ashley (b,*), Elizabeth Ferrell Bjerke (a), Margaret A. Potter (a), Hasan Guclu (a), Jaromir Savelka (b), Matthias Grabmair (b)

(a) Graduate School of Public Health, University of Pittsburgh, USA
(b) Intelligent Systems Program, School of Law, University of Pittsburgh, USA

Abstract. This work applies network analysis to explore relationships within civil networks established by state law and to compare, both visually and quantitatively, similarly-purposed legal systems across state and federal governments. The paper describes an application of the methodology to public health systems for emergency purposes. An example illustrates how the analysis informs an investigation of legally mandated communications links for dealing with infectious diseases like the recent Ebola scenario in a Texas hospital. The network diagrams also serve as visual indexes into a legal information database enabling users, such as public health officials working in field offices, to quickly answer such questions as, “What regulations establish communications links between government public health agencies and hospitals?”

1. Introduction to Public Health Emergency Tools: LENA and the Emergency Law Database

Law is an important component of public health, and statutory provisions help direct which entities, or “agents”, must work collaboratively within the public health system (PHS). Law can also determine which functions agents must perform, including surveilling population health, identifying cases of infectious disease and developing effective disease containment interventions.

The Center for Public Health Practice and the Public Health Dynamics Laboratory at Pitt Public Health developed two tools to enable public health practitioners, policy and decision makers, and emergency management professionals to better understand legally directed functions guiding the preparation for and response to emergencies with public health implications: LENA (http://www.phdl.pitt.edu/LENA/), a tool creating network diagram visualizations of legally directed relationships between PHS agents as mandated by federal or state law; and the Emergency Law Database (http://www.phasys.pitt.edu/default.aspx), a searchable collection of the full text of the relevant emergency laws.

LENA applies network analysis to relationships within the civil networks established by state law and helps to compare, both visually and quantitatively, similarly-purposed legal systems across state and federal governments. In this work, the research team applied LENA to a developing inventory of public health preparedness laws from a sample of twelve U.S. states (footnote 1) and the federal government, which had been numerically coded for use with network analysis. A complete description of the legal preparedness inventory and the coding methodology appears in [16].

The paper describes the LENA network diagrams and analysis and the Emergency Law Database in Section 2. In Section 3, we illustrate how the network analysis informs an investigation of legally mandated communications links relative to an infectious disease scenario like the Ebola situation that recently unfolded in a Texas hospital. The network diagrams assist in such investigations and serve as visual indexes into the legal information database enabling users, such as public health agents working in field offices, to quickly answer questions such as, “What regulations establish communications links between government public health agencies and hospitals?” We survey related work in Section 4 and conclude in Section 5.

2. LENA Network Diagrams and Analysis

As explained in [16], the legal directives that explicitly addressed infectious disease emergencies were represented for network analysis by numeric codes within nine categories including the following four:

• Acting Agent: the organization, group, or individual whose activity the law directs (see Table 1);
• Partner Agent: the organization, group or individual with which the law defines a relationship to the acting PHS agent (see Table 1), including “none” when the legal directive states no particular partner;
• Action: the verb characterizing what is to be done. Actions ranged from the very specific, such as “vaccinate”, to the more general, such as “require,” “enact,” and “establish.”
• Goal: the product or result of an Action. A typical action-goal pairing would be “establish/a plan” or “require/a training program.”

This study focuses on two kinds of agents: (1) Governmental Public Health (GovPubHealth), an organization and its divisions, committees, commissions, or authorities designated by the federal, state, or local government having the usual powers and duties of a board of health or health department to protect and preserve the public’s health. This includes epidemiologists, medical examiners, and coroners. (2) Hospital (Hospital), an institution that, under the supervision of licensed doctors, primarily provides inpatient services for individuals who are sick, injured, or disabled. Other types of agents are defined in [16].

All coded information was entered into Excel spreadsheets for use with the network analysis library Igraph [4]. Pajek was used to generate network visualizations [1]. The network measurements applied include density, reciprocity, inclusiveness, degree, strength, hub and authority [15, 18, 5, 3, 19, 10] (see Table 1).

* Corresponding author.
1 Alaska (AK), California (CA), Florida (FL), Kansas (KS), Maryland (MD), New York (NY), North Dakota (ND), Ohio (OH), Pennsylvania (PA), Rhode Island (RI), Texas (TX), and Wisconsin (WI). Full texts of the statutes are posted on a searchable database (visit http://www.phasys.pitt.edu/default.aspx) and accessible from the LEgal Network Analyzer (LENA) applet (visit http://www.phdl.pitt.edu/LENA/).
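To make the coding scheme concrete, the following minimal sketch shows one way coded directives could be aggregated into a weighted, directed edge list. It is not the Excel and Igraph pipeline described above: the record layout, the function name `build_edges`, and the four sample directives are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical coded directives: (acting agent, partner agent, action, goal).
# The rows are invented for illustration and are not taken from the inventory.
directives = [
    ("GovPubHealth", "Hospital", "require", "a report"),
    ("GovPubHealth", "Hospital", "establish", "a plan"),
    ("Hospital", "GovPubHealth", "notify", "a case"),
    ("GovPubHealth", "none", "enact", "a rule"),
]


def build_edges(rows):
    """Count directives per (acting, partner) pair, skipping partner 'none'."""
    edges = Counter()
    for acting, partner, _action, _goal in rows:
        if partner != "none":
            edges[(acting, partner)] += 1
    return edges


edges = build_edges(directives)
```

Each key is a directed pair and each value is the number of directives for that pair, which is the quantity the strength measures below rely on.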

Table 1. Statutory Task Network Diagram Measures

Measure: Definition

Density (δ): Total number of connections in a network, expressed as a proportion of the maximum possible number of connections, n(n-1)/2, where n is the number of agents. Density measures the spread of legally mandated tasks over all agents: the higher the density, the wider the sharing of the tasks among the agents.

Inclusiveness (I): Measures the number of agents with network connections (i.e., the number of all agents minus the number of unconnected agents). Inclusiveness measures agents’ involvement in the network: higher inclusiveness indicates that more possible agents are involved.

Degree (d): Measures each agent’s number of connections with other agents. An agent is considered “central” if it has high degree. For the purpose of this analysis, legal statements create “directed networks” because any agent may either act toward another or receive action from another. In such a directed network, the measure of degree has two parts: an agent’s in-degree (d_in) is the number of connections directed to it, and its out-degree (d_out) is the number of connections directed from it. In a legal network, a high-degree or central agent is a relatively important contributor, so that its absence or malfunction would probably impair the response.

Strength (s): Measure used in “weighted” networks, taking account of the frequency of agents’ pairings, for both the incoming (“in-strength”, s_in) and outgoing (“out-strength”, s_out) connections of each agent. For each agent, the out-strength is the number of outgoing directives (responsibilities), i.e., the frequency of citations as an acting agent, and the in-strength is the number of incoming directives (load) from other agents, i.e., the frequency of citations as a partnering agent. Thus, networks with high average strength have relatively more directives among their agents, suggesting much inter-relatedness.

Reciprocity (r): Ratio of reciprocal connections in the network. This shows how many of the connections go both ways, or how many times an agent is an acting agent and at the same time a partnering agent with another agent.

Hub and authority scores (h and a): Based on a link analysis algorithm that rates web pages on the Internet. The hub score of an agent estimates the value of its connections to other agents, and the authority score of an agent estimates the value of its resources attractive to others. In our legal network framework, the hub score refers to the centrality of the agent as an acting agent and the authority score refers to its centrality as a partnering agent.
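The definitions in Table 1 can be illustrated on a toy network. The sketch below is an assumption-laden reading of the table rather than the Igraph computation used in the study: the agent names and weights are invented, density counts connected agent pairs against n(n-1)/2 as the table is worded, and reciprocity is taken as the share of directed connections whose reverse also exists.

```python
# Toy weighted directed network: (acting, partner) -> number of directives.
# Agent names and weights are invented for illustration.
agents = ["GovPubHealth", "Hospital", "Laboratory", "IndianTribe"]
edges = {
    ("GovPubHealth", "Hospital"): 3,
    ("Hospital", "GovPubHealth"): 1,
    ("GovPubHealth", "Laboratory"): 2,
}


def out_degree(agent):
    """Number of distinct partners the agent acts toward."""
    return sum(1 for (src, _dst) in edges if src == agent)


def in_degree(agent):
    """Number of distinct agents acting toward this agent."""
    return sum(1 for (_src, dst) in edges if dst == agent)


def out_strength(agent):
    """Number of directives naming the agent as acting agent."""
    return sum(w for (src, _dst), w in edges.items() if src == agent)


def in_strength(agent):
    """Number of directives naming the agent as partnering agent."""
    return sum(w for (_src, dst), w in edges.items() if dst == agent)


# Inclusiveness: number of agents that have at least one connection.
connected = {a for pair in edges for a in pair}
inclusiveness = len(connected)

# Density as worded in Table 1: connected pairs over n(n-1)/2 possible pairs.
n = len(agents)
pairs = {frozenset(pair) for pair in edges}
density = len(pairs) / (n * (n - 1) / 2)

# Reciprocity: share of directed connections whose reverse also exists.
reciprocity = sum(1 for (s, d) in edges if (d, s) in edges) / len(edges)
```

In this toy network the unconnected agent lowers inclusiveness to 3 of 4 agents, and the single two-way pairing accounts for two of the three directed connections.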

The results of legal network analysis in the sample states and the federal government are presented both visually and quantitatively.

Network maps like the one in Figure 1 provide visual signals for important characteristics of legal networks for infectious disease response. For instance, Figure 1 shows the legally directed network of agents named by Texas law for epidemic emergencies involving infectious diseases. In LENA networks, circular nodes represent agents in the PHS. The size of the node correlates with how central the agent is in the network: the larger the node, the more central the role the agent plays. In particular, in Figure 1, the relative size of the node representing an agent is proportional to its out-strength (as acting agent) and in-strength (as partnering agent), following the definitions in Table 1. For instance, in Texas, Governmental Public Health is a high-strength agent in terms of both in- and out-strength. An agent that does not have any legally directed function in a particular network is designated as an “Isolated Agent” (e.g., Indian Tribe in Figure 1).

A line connecting agents, or an “edge”, shows that a law or group of laws directs action between two agents. Blue lines are unilateral legal directives (i.e., laws that direct one agent to perform a function with a partner agent). Pink lines are bidirectional legal directives (i.e., an acting agent is legally directed to perform a function or multiple functions with a partner agent and the partner agent, in turn, is legally directed to perform a function or functions with the acting agent). The thickness of the lines represents the strength of the connection between the two agents. Thicker ties mean that there are more legal directives requiring interaction between the two agents as acting and partner agent pairs.

Users can create PHS networks defined by five categories, which are found under the LENA Data dropdown menu.

Show State. PHS networks can be created based upon the laws in one of 12 states or those of the federal government.
Emergency. PHS networks can be created for non-specific emergencies or one of 20 discrete emergency types (such as epidemics, bioterrorism or hurricanes).
Actions-goals. Users can select all action-goals (meaning any function directing the agent pairings) or one of 7 specific functions.
Purpose. Networks can be focused on one of the phases of emergencies (preparedness, response, management or recovery) or on an aggregate of all 4 phases.
Compare State. The Compare State function compares the similarities and differences in legal directives between two jurisdictions.
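As a rough illustration of how a menu selection could translate into a drawn network, the sketch below filters hypothetical coded rows by state and emergency type and then labels each agent pair as unilateral or bidirectional. The row layout, the function names `select_network` and `classify`, and the choice to take line thickness as the total number of directives in both directions are all assumptions, not a description of LENA's internals.

```python
from collections import Counter

# Hypothetical coded rows: (state, emergency, acting agent, partner agent).
rows = [
    ("TX", "epidemic", "GovPubHealth", "Hospital"),
    ("TX", "epidemic", "Hospital", "GovPubHealth"),
    ("TX", "epidemic", "GovPubHealth", "Laboratory"),
    ("TX", "hurricane", "GovPubHealth", "Hospital"),
    ("PA", "epidemic", "GovPubHealth", "Hospital"),
]


def select_network(rows, state, emergency):
    """Weighted directed edges for one state and one emergency type."""
    return Counter(
        (acting, partner)
        for st, em, acting, partner in rows
        if st == state and em == emergency
    )


def classify(edges):
    """Map each unordered agent pair to (kind, thickness)."""
    result = {}
    for (src, dst), weight in edges.items():
        pair = tuple(sorted((src, dst)))
        kind = "bidirectional" if (dst, src) in edges else "unilateral"
        # Assumed thickness: directives in both directions added together.
        thickness = weight + edges.get((dst, src), 0)
        result[pair] = (kind, thickness)
    return result


tx_epidemic = classify(select_network(rows, "TX", "epidemic"))
```

Adding further filters for action-goal or purpose would follow the same pattern, with one extra field per row and one extra condition in `select_network`.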

For example, in using the LENA tool to construct Figure 1, the legally directed network in Texas for epidemic emergencies, Texas was selected as the Show State, Epidemic as the emergency type, all action-goals as the function, and all purposes (i.e., a visualization of the PHS for all phases of epidemics).

As is apparent in Figure 1, the Texas infectious disease legal network is quite dense, reflecting the fact that the state’s laws correspond to more than 2200 lines of code representing statutory provisions that establish linkages for PHS infectious disease emergencies. By contrast, PA’s network (not shown) is visually much sparser, reflecting its considerably lower number of relevant legal directives.

Table 2 presents an example of some quantitative information concerning the centrality of two types of agents in four states: KS, PA, TX, and WI. The network measurements include in-degree, out-degree, in-strength and out-strength, and hub and authority scores of Governmental Public Health agencies and Hospitals. In general, for example, one can say that Hospitals are more often referred to as acting agents than as partnering agents in Texas, in contrast to the other three states, and that Governmental Public Health agents have duties as both acting and partnering agents comparable in size in all four states.

Table 2. Number of incoming and outgoing connections (d_in and d_out), frequency as partnering and acting agents (s_in and s_out), and hub and authority scores (h and a) of Governmental Public Health (GPH) agencies and Hospitals in four states.

State | GPH d_in | GPH d_out | GPH s_in | GPH s_out | GPH h | GPH a | Hospital d_in | Hospital d_out | Hospital s_in | Hospital s_out | Hospital h | Hospital a
KS | 13 | 12 | 50 | 58 | 0.66 | 0.41 | 7 | 8 | 14 | 16 | 0.13 | 0.09
PA | 11 | 10 | 50 | 95 | 1 | 0.4 | 3 | 3 | 28 | 8 | 0.08 | 0.33
TX | 18 | 20 | 174 | 189 | 1 | 0.82 | 7 | 13 | 14 | 62 | 0.48 | 0.13
WI | 15 | 13 | 100 | 117 | 1 | 0.55 | 4 | 4 | 9 | 18 | 0.12 | 0.01
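The hub and authority columns of Table 2 rest on the link analysis algorithm mentioned in Table 1. A minimal power-iteration sketch follows, under stated assumptions: the toy network is invented, and each score vector is rescaled so that its largest entry is 1. That scaling would be consistent with the values of 1 reported for GPH, although the paper does not state which scaling was used.

```python
def hits(edges, iterations=50):
    """Hub and authority scores by power iteration, scaled so the max is 1."""
    nodes = sorted({a for pair in edges for a in pair})
    hub = {v: 1.0 for v in nodes}
    auth = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # Authority: weighted sum of hub scores of agents pointing to v.
        auth = {v: sum(w * hub[s] for (s, d), w in edges.items() if d == v)
                for v in nodes}
        top = max(auth.values()) or 1.0
        auth = {v: x / top for v, x in auth.items()}
        # Hub: weighted sum of authority scores of agents v points to.
        hub = {v: sum(w * auth[d] for (s, d), w in edges.items() if s == v)
               for v in nodes}
        top = max(hub.values()) or 1.0
        hub = {v: x / top for v, x in hub.items()}
    return hub, auth


# Toy network; names and weights are invented for illustration.
toy = {
    ("GovPubHealth", "Hospital"): 3,
    ("GovPubHealth", "Laboratory"): 1,
    ("Hospital", "GovPubHealth"): 1,
}
hub, auth = hits(toy)
```

In the toy network the agent that issues most directives ends up with the top hub score, and the agent that receives most of them ends up with the top authority score, mirroring the acting and partnering readings given in Table 1.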

One can also display other important measurements at the network level for different states. The spread of responsibilities over all the agents in the network is reflected in the density measurements (δ) (not shown). Texas has one of the densest networks, with δ above 20%. Pennsylvania is among the study states with less dense networks, a natural result of its low number of connections and high number of unconnected agents. The average degree for state legal networks varies among states and can also be presented in tabular form (not shown): the highest value, 6.71, is observed in Texas. In other words, in Texas the agents are connected, on average, to almost 7 other agents. KS, PA, and WI have lower densities. The average strength, or the average number of citations per agent, exhibits a similar trend. Average degree and strength are often correlated with the history of infectious disease outbreaks in the state: states with more experience have more laws and thus more connections and citations for the agents.

Finally, LENA is also an interface to the Emergency Law Database. Clicking on the links between the nodes reveals a listing of legal citations containing all the laws we analyzed pertaining to either of the agents. The Emergency Law Database is a searchable repository of all laws we analyzed for this study. The laws are searchable by key word (such as infectious or Ebola) in any number of jurisdictions, from one up to 13. If desired, searches can be limited to laws pertaining to between one and 26 enumerated PHS agents.
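A key word search restricted by jurisdiction and agent might look like the sketch below. The record layout and the function name `search` are hypothetical, the summaries paraphrase this paper's own descriptions rather than the statutory text, the agent tags are guesses, and requiring every key word to match is an assumption about how the database combines terms.

```python
# Hypothetical records: citation, jurisdiction, agents, and a short summary.
# The summaries paraphrase this paper's descriptions, not the statutory text.
laws = [
    {"cite": "25 TAC § 97.3", "state": "TX",
     "agents": {"Hospital", "GovPubHealth"},
     "text": "reportable diseases: notify health department immediately"},
    {"cite": "KAR 28-1-2", "state": "KS",
     "agents": {"Hospital", "GovPubHealth"},
     "text": "infectious diseases reported within 4 hours"},
    {"cite": "28 Pa. Code 27.21a", "state": "PA",
     "agents": {"Hospital", "GovPubHealth"},
     "text": "viral hemorrhagic fever reported within 24 hours"},
]


def search(laws, keywords, states=None, agents=None):
    """Return citations whose text contains every keyword (case-insensitive),
    optionally restricted to given jurisdictions and agents."""
    found = []
    for law in laws:
        text = law["text"].lower()
        if not all(k.lower() in text for k in keywords):
            continue
        if states is not None and law["state"] not in states:
            continue
        if agents is not None and not (law["agents"] & set(agents)):
            continue
        found.append(law["cite"])
    return found


tx_hits = search(laws, ["reportable", "diseases"], states={"TX"})
```

Leaving `states` or `agents` as `None` searches all jurisdictions or all agents, which corresponds to not narrowing the search in the interface.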

3. Example: Using LENA and the Emergency Law Database to Analyze Legally Mandated Reporting Requirements or “What Happened at the Emergency Room at Texas Presbyterian Hospital?”

The U.S. House of Representatives’ Energy and Commerce Committee is charged with investigating and overseeing the nation’s response to the Hemorrhagic Fever crisis involving the Ebola virus. The following is a paraphrase of the Committee’s summary of the recent events occurring at Texas Presbyterian Hospital.

On September 25, 2014 at 10:37 p.m., Thomas Eric Duncan presented to the Emergency Department (ED) of Texas Presbyterian Hospital in Dallas, Texas. At 11:36 p.m., the triage nurse recorded Mr. Duncan’s chief complaint as “abdominal pain, dizziness, nausea and headache (new onset)” and a temperature of 100.1F. Obtaining a patient’s travel history was not part of the triage nurse’s protocol. A physician accessed the triage nurse’s report at 12:27 a.m., but did not begin his examination. The primary registered nurse continued the patient assessment from 12:33 to 12:44 a.m. She asked about Mr. Duncan’s travel history and documented that he “came from Africa 9/20/14.” No further significance was attributed to his point of origin. Despite a prompt by the electronic health record (EHR), the travel information was not verbally communicated to the physician. The physician accessed the EHR again between 12:52 and 1:10 a.m., including the portions containing the travel history. Over the next two hours, the physician accessed the EHR several times to review test results and Mr. Duncan’s responses to treatment. Mr. Duncan’s temperature spiked to 103.0F at 3:02 a.m., later fell to 101.2F and, at 3:37 a.m., he was discharged with a diagnosis of sinusitis and abdominal pain. On September 28, 2014, Mr. Duncan returned to the hospital and was admitted. It is unclear when and how the Texas Department of State Health Services was notified that Mr. Duncan’s symptoms were consistent with Ebola. According to the Centers for Disease Control and Prevention (CDC), Texas Presbyterian Hospital isolated the patient and sent specimens for testing at the CDC and at a Texas lab participating in the CDC’s Laboratory Response Network. On October 8, 2014, Mr. Thomas Eric Duncan became the first person to die on U.S. soil from Ebola.

We used both LENA and the Emergency Law Database to answer the question, “What Texas laws require interaction between hospitals and governmental public health entities?” This question arises from the lapse of communication apparent from the events summarized above, which suggest that the hospital emergency department staff had not been sufficiently alerted to the possibility of seeing an Ebola-exposed patient.

As is shown in Figure 1, Governmental Public Health is the most central agent in the Texas epidemic emergency network. The link between Governmental Public Health and Hospitals is red, indicating that some of the functions between the two agents are instigated by Governmental Public Health and others are initiated by Hospitals. When the link between the agents is clicked, 119 laws are cited in the Emergency Law Database. Under 25 TAC § 97.8, the Commissioner of Health and the Health Authority may investigate causes of infectious diseases and verify diagnoses. Information concerning the diseases must be given to the patient, but there is no requirement that such information be shared with the hospital.

In order to narrow the search, the terms “reportable” and “diseases” were entered as key words, and Texas was selected as the jurisdiction. Twenty-four citations were revealed by this search. Three of the citations were relevant to our study question: 25 TAC §§ 97.2, 97.3 and 97.4. These laws require hospitals in Texas to notify the local or regional health department by phone immediately after identifying a case of Viral Hemorrhagic Fever. Additionally, 25 TAC § 97.3 refers hospitals to a Department of State Health Services website for further guidance. Further, although there may be policies in place addressing reporting issues, there do not appear to be any requirements under Texas law mandating that healthcare providers verbally share information concerning a patient’s travel history or that the Texas Department of State Health Services require hospitals to delve into a patient’s point of origin. Finally, none of the twenty-four citations directed communication from governmental public health to hospitals concerning an ongoing health alert (such as Ebola was at the time).

As explained above, the epidemic emergency networks created in some of the other jurisdictions look remarkably different from that in Texas.
Because   of   these   noted   variances,   legal   searches   were   conducted   using  the  Emergency  Law  Database  in  Kansas,  Pennsylvania  and  Wisconsin  to  see  whether  these  states  had   reporting   laws   for   suspected   cases   of   Ebola   that   were   different   from   those   established   in   Texas.   A   comparative   search   such   as   this   can   reveal   critical   information   for   policy   makers   and   legislators   interested   in   amending   laws   to   reflect   best   practices   and   increase   public   health   safety.   In   Kansas,   any   exotic   disease   and   other   infectious   diseases   posing   as   a   public   health   threat   and   constituting   a   risk   to   the   public   health   (presumably,   these   requirements   encompass   Ebola)   must   be   reported   by   the   administrator   of   a   hospital   to   the  Secretary  of  Health  and  Environment  within  4  hours  (KAR  28-­‐1-­‐2  and  28-­‐1-­‐4).    In  Wisconsin,  any  illness   4

caused  by  an  unusual,  foreign,  or  exotic  agent  that  with  public  health  implications  must  be  reported  to  a  local   health  officer  by  a  health  care  facility  immediately,  and  to  the  National  Electronic  Disease  Surveillance  System   within   24   hours.   In   Pennsylvania,   the   timeframe   within   which   reporting   must   occur   is   considerably   longer.   Viral   Hemorrhagic   Fever   must   be   reported   by   a   health   care   facility   to   the   local   morbidity   reporting   office   within  24  hours  (28  Pa.  Code  27.21a).       Figure  1.  The  Legally  Direct  Network  in  Texas  for  Epidemic  Emergencies  

4. Related Work

Network analysis has been applied to co-occurrence and citation patterns in the US [2] and French legal codes [11], as well as to communications in legal contexts, such as interpreting emails in e-discovery [8], and in public health, to model communication patterns of workers within local health departments [12].

Network analysis has supported corpus-based inferences about legal regulations. The system in [9] helps answer questions such as, “What is the most important or influential regulation in the Netherlands?” by analyzing the network of co-citations between the interconnected Web resources (URIs) associated with the legal regulations in the MetaLex Document Server (MDS). [17] also employed citation network analysis in order to determine the most influential regulations in a corpus of hundreds of legislative documents represented in HTML format, from which rules extracted citation information. [7] employed a finer-grained interlinking of statutory law in a sources-network-of-law (SNL) at the paragraph-article level.

A number of approaches have been applied to integrate network analysis in legal information retrieval. In [22] a network of special “legal issues”, extracted from the corpus, could assist users in retrieving case materials. Each legal issue is a “statement of belief, opinion, a principle, etc.,” usually containing one or more legal concepts, that has a legal implication (e.g., “Thirteen-year-olds should not own a vehicle.”) mined from a full-text case law database. When cases are connected by citations, a legal issue is the proposition for which the case is cited. The system in [13] retrieved documents based on queries containing semantic descriptors and indicators of cross-referential relations between documents (e.g., “Which orders talk about abnormally annoying noise … and make reference to decrees talking about soundproofing… ?”). In a legal information retrieval application, [21] address how to automatically determine a context of laws to display to a user of an online hyperlinked legislative database relevant to the particular legal article the user retrieved. For a small corpus comprising two articles of the Dutch “foreigners law”, the researchers defined “context networks” for each article comprising a selection of all incoming references to, and all outgoing references from, the article in focus, based on a weighting scheme favoring references that are outgoing, not internal references, referring to definitions in prior articles, recently changed, or having a high degree of network centrality.

Our project takes a different approach to integrating network analysis and legal information retrieval. In our networks, links represent relationships between participants in a state’s public health system. Our system helps users, such as public health agents working in field offices, answer questions such as, “What regulations establish the relationships between government public health agencies and hospitals?” The network diagrams serve as visual indexes into the legal information database. When a user clicks on a link, the system retrieves the specific statutory provisions that establish the relationship between participants and that justify the link between the nodes. In related work [14, 6] machine learning text classification is employed to automate identifying the relevant statutory provisions that set up relevant relationships among participants and extracting the relevant features so that networks can be constructed for new states’ public health systems. In addition, our networks support comparing different states’ systems both visually and in terms of analytical network measures (i.e., density, inclusiveness, degree, strength, reciprocity, hub and authority) [16].
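To make the idea of a network that doubles as an index more concrete, the following minimal Python sketch stores the provisions behind each directed link and computes three of the measures named above. It is an illustration only, not the LENA implementation: the agents other than Governmental Public Health and Hospitals, all provision labels, and the function names are hypothetical, and the formulas are common textbook definitions that may differ in detail from those used in [16].

```python
# Toy data: agents beyond the two named in the text and all provision labels
# are hypothetical placeholders, not entries from the Emergency Law Database.
AGENTS = ["Governmental Public Health", "Hospitals", "Laboratories", "Schools"]

# Directed link (initiating agent, receiving agent) -> provisions justifying it.
PROVISIONS_BY_LINK = {
    ("Governmental Public Health", "Hospitals"): ["provision-1"],
    ("Hospitals", "Governmental Public Health"): ["provision-2", "provision-3"],
    ("Governmental Public Health", "Laboratories"): ["provision-4"],
}


def provisions_for(source, target, index=PROVISIONS_BY_LINK):
    """Return the provisions behind a directed link (empty list if none)."""
    return list(index.get((source, target), []))


def network_measures(agents, index):
    """Compute density, inclusiveness and reciprocity of a directed network."""
    n = len(agents)
    # A link exists only if at least one provision justifies it.
    links = {link for link, cites in index.items() if cites}
    # Density: existing links over all possible ordered pairs of agents.
    density = len(links) / (n * (n - 1)) if n > 1 else 0.0
    # Inclusiveness: share of agents that take part in at least one link.
    connected = {agent for link in links for agent in link}
    inclusiveness = len(connected) / n if n else 0.0
    # Reciprocity: share of links whose reverse link also exists.
    mutual = sum(1 for (s, t) in links if (t, s) in links)
    reciprocity = mutual / len(links) if links else 0.0
    return {
        "density": density,
        "inclusiveness": inclusiveness,
        "reciprocity": reciprocity,
    }


MEASURES = network_measures(AGENTS, PROVISIONS_BY_LINK)
```

Keying the index by ordered pairs keeps the direction of each legal directive, which is what allows reciprocity to be computed and a clicked link to be resolved to its provisions with a single lookup.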

5. Conclusions

Traditional legal analysis is a tool for interpreting statutes and regulations, but it is not particularly well suited to measuring relationships within systems. Such measurements are in the domain of network analysis. Network analysis can describe network characteristics, analyze complex patterns of relationship, and produce results graphically. Previous studies have applied network analysis to many areas; however, quantitative network analysis has not previously been applied to legal preparedness. In addition, this work supplements network analysis with statutory information retrieval.

LENA and the Emergency Law Database provided useful analytic tools. The lack of a legal directive specifying that governmental public health in Texas should notify hospitals about an infectious disease alert appears relevant to the lapse of communication which resulted in hospital personnel discharging the Ebola-exposed patient from the emergency department. This finding is not by itself conclusive but rather triggers the need to explore further how to optimize communication between these two agents.

Several aspects of this study give rise to limitations. First, for many practical purposes, traditional legal analysis must be done to discern the substantive content of legal directives. The network methods described here should be considered as a complement to, rather than a substitute for, traditional legal analysis. Second, statutes and regulations undergo frequent revision, so the results presented here should be viewed as a snapshot of the sample states’ infectious disease response networks.
Third, the laws analyzed in this study do not necessarily describe the actual conduct of agents in infectious disease situations, nor do they necessarily describe fully the response networks that may arise in practice.

Network analysis provides quantitative measures and graphic visualizations for complex patterns embedded in infectious disease emergency and preparedness laws, facilitating interpretation and comparison by non-lawyers. This analysis of infectious disease emergency networks as defined by laws from sample states and the federal government draws attention to issues for lawmakers, emergency planners, and researchers. State-level lawmakers and planners might consider whether greater inclusion of agents, modeled on the federal government, would enhance their infectious disease emergency response laws and whether more agents should be engaged in planning and policy making. Further research should explore whether and to what extent legislated infectious disease emergency directives impose constraints on practical response activities, including emergency planning. Also, coding plans/policies and actual communication patterns among agents and combining them with legal networks would shed more light on the dynamics of emergency preparedness and response.

Acknowledgements: The authors thank the following contributors: Jessica Kanzler, Amy Anderson, Pamela Day, Karen Marryshow, Geoffrey Mwaungulu, and Ryan Stringer. This work was funded through the Center for Public Health Practice by the Centers for Disease Control and Prevention (cooperative agreement 5P01TP000304). Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the Centers for Disease Control and Prevention. This work was also supported by the National Institute of General Medical Sciences MIDAS grant 1U54GM088491-01, which had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

6. References

[1] Batagelj V, Mrvar A. Pajek - Program for Large Network Analysis. Connections 1998;21(2):47-57. Available at: http://pajek.imfm.si. Accessed March 28, 2014.
[2] Bommarito MJ, Katz DM. A mathematical approach to the study of the United States Code. Physica A 2010;389:4195-4200.
[3] Borgatti SP. Structural holes: Unpacking Burt’s redundancy measures. Connections 1997;20(1):35-38.
[4] Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal, Complex Systems 2006:1695.
[5] Freeman LC. Centrality in social networks: I. conceptual clarification. Social Networks 1979;1:215–239.
[6] Grabmair, M., Ashley, K., Hwa, R. & Sweeney, P. “Toward Extracting Information from Public Health Statutes using Text Classification and Machine Learning” Jurix 2011 Proc. 73-82 (Atkinson, K. ed.) (2011).
[7] Gultemen, D. and T. van Engers (2013) Graph-Based Linking and Visualization for Legislation Documents (GLVD). In R. Winkels (ed.) 2013. pp. 67-80.
[8] Henseler, H. Network-based filtering for large email collections in E-Discovery, Artificial Intelligence and Law 18(4):413-430 (2010).
[9] Hoekstra, R. (2013) A Network Analysis of Dutch Regulations - Using the MetaLex Document Server. In R. Winkels (ed.) 2013. pp. 47-58.
[10] Kleinberg JM. Authoritative sources in a hyperlinked environment. J. of the ACM 1999;46(5):604–632.
[11] Mazzega P, Bourcier D, Boulet R. The network of French legal codes. In: Proceedings of the 12th International Conference on Artificial Intelligence and Law (ICAIL): 236–237, New York, NY: ACM; 2009.
[12] Merrill J, Caldwell M, Rockoff ML, Gebbie K, Carley KM, Bakken S. Findings from an organizational network analysis to support local public health management. J Urban Health 2008;85(4):572-84.
[13] Mimouni, N., M. Fernandez, A. Nazarenko, D. Bourcier & S. Salotti (2013) A Relational Approach for Information Retrieval on XML Legal Sources. In Proc., ICAIL 2013. pp. 212-216. ACM Press, New York.
[14] Savelka, J., K. Ashley & M. Grabmair. Mining Information from Statutory Texts in Multi-Jurisdictional Settings. In R. Hoekstra (Ed.) Proc., 27th Ann. Conf. on Legal Knowledge and Inf. Sys. (Jurix 2014).
[15] Scott J. Social Network Analysis: A handbook, 3rd ed. London, UK: Sage Publications; 2013.
[16] Sweeney, P., E. Bjerke, M. Potter, H. Guclu, C. Keane, K. Ashley, M. Grabmair & R. Hwa. (2013) Network Analysis of Manually-Encoded State Laws and Prospects for Automation. In R. Winkels (ed.) 2013. pp. 29-36. Expanded in R. Winkels, N. Lettieri, & S. Faro (eds.) Network Analysis in Law, Collana: Diritto Scienza Tecnologia/Law Science Technology – Temi, 3, Edizioni Scientifiche Italiane.
[17] Szőke, A., K. Mácsár & G. Strausz (2013) A Text Analysis Framework for Automatic Semantic Knowledge Representation of Legal Documents. In R. Winkels (ed.) 2013. pp. 59-66.
[18] Wasserman S, Faust K. Social Network Analysis: Methods and Applications; Cambridge, MA: Cambridge University Press; 1994.
[19] Watts DJ, Strogatz SH. Collective dynamics of 'small-world' networks. Nature 1998;393(6684):409–10.
[20] Winkels, R. (ed.) (2013) Proc. 1st Workshop on Network Analysis in Law, ICAIL 2013. http://www.leibnizcenter.org/~winkels/NAiL2013Proceedings.pdf (visited 29 October 2014).
[21] Winkels, R. & A. Boer. 2013. Finding and Visualizing Context in Dutch Legislation. In R. Winkels (ed.) 2013. pp. 23-29.
[22] Zhang, P., H. Silver, M. Wasson, D. Steiner and S. Sharma (2013) Knowledge Network Based on Legal Issues. In R. Winkels (ed.) 2013. pp. 23-2


Document-enhancement and network analysis for criminal investigation: an holistic approach

Nicola Lettieri (1), Delfina Malandrino (2), Luca Vicidomini (2)

(1) ISFOL/University of Sannio, Italy
(2) Department of Computer Science, University of Salerno, Italy

Abstract

The last decade has witnessed a growing interest in the use of Social Network Analysis techniques in the study of criminal organizations, for both scientific and investigative purposes. Criminal Network Analysis is nowadays a well-established interdisciplinary research area in which criminology, Social Network Analysis and computer science converge with other disciplines to unveil hidden patterns in growing volumes of crime data. A relevant issue in this scenario is the variety of the information used to build and analyze criminal networks. In most cases the analysis is performed “ex post” over data belonging to a single category (e.g., phone calls among criminals). There are still few experiences that exploit different data types at the same time (criminal records, wiretapping, environmental tapping) and bring network analysis techniques into the phase of judicial investigation in which the relevant information is produced. In this paper we present a holistic approach that exploits document-enhancement, network analysis and visualization techniques to support public prosecutors and police in increasing their understanding of the structural and functional features of criminal organizations.

Keywords: Criminal Network Analysis, Document-Enhancement, Visual Information Retrieval.

1. Introduction

The last decade has witnessed a growing interest in the use of Social Network Analysis (SNA) techniques in the study of criminal organizations, for both scientific and investigative purposes. Sociality has a tremendous influence on crime: a large part of criminal phenomena, from pornography trafficking to hacking and other cybercrimes, is strongly conditioned (inhibited or facilitated) by relational dynamics. Criminals are often highly social actors: they communicate with each other, collaborate and form groups in which it is possible to distinguish leaders, subcommunities and actors with very different roles. SNA techniques therefore seem well suited to the needs of organized crime studies. Their application is facilitated by the growing amount of digital information available today, ranging from e-mails to credit-card operations. The data deluge and the availability of increasingly advanced data mining techniques are offering both researchers and law enforcement agencies new tools and methodologies to unveil and understand the structures and dynamics of criminal networks.

Criminal Network Analysis (CNA) is today a well-established interdisciplinary research area (Morselli, 2009) in which network analysis techniques are employed to analyze large volumes of relational data and gain deeper insights into the criminal network under investigation (a brief review of existing works in this field is presented in Section 2). Two important issues arise in this field. The first is that the tools used by investigators very often do not allow them to properly collect all the information needed for network analysis. The second is connected with the variety of the information used to build and analyze criminal networks. In most cases the analysis is performed “ex post” over data belonging to a single category (e.g., phone calls among criminals). There are still few experiences that exploit different data types at the same time (criminal records, wiretapping, environmental tapping) and bring network analysis techniques into the phase of judicial investigation in which the relevant information is produced.

In this paper we present a holistic approach that, taking its cue from previous work (Lettieri et al. 2014), combines document-enhancement methods, network analysis principles and visualization techniques to support public prosecutors and police in exploiting the investigative advantages of network analysis. After a brief introduction and a general overview of interesting works in this field, we describe, from both a technical and a methodological point of view, the prototype of a framework supporting the proposed approach. Finally, we conclude with some remarks and future directions.

2. Criminal Network Analysis research: a state of the art

There are many examples of the application of SNA principles to the analysis of criminal organizations. Among the most interesting works we can cite (Stewart, 2001; Xu, 2005a). In (Xu, 2005a) the authors state that the effective use of SNA techniques to mine criminal network data can have important implications for crime investigations, mainly because these techniques can help law enforcement agencies fight crime proactively, for example by intervening when a crime takes place and by committing the required police effort. They developed CrimeNet Explorer (Xu, 2005b), a system that, in addition to its visualization functionality, detects subgroups in a network, discovers interaction patterns between groups, and identifies central members of a network.

As also discussed in (Hutchins, 2010), the most common challenge to address when designing and implementing a new software system is the ability to use information and documents from many sources. Data can be structured or unstructured, and unstructured data is the most significant barrier for researchers who want to develop a new system for domain experts. Additionally, it must be emphasized that data can come from many domain sources as well as from many paper and electronic sources.


The authors of (Hutchins, 2010) address this challenge by proposing HIDTASIS, a software system that supports both qualitative (visual) and quantitative (mathematical) analysis of criminal network data. Like the other works analyzed, the proposed system only applies SNA metrics to analyze the structure of the criminal network in order to derive useful insights about entities' roles and communication patterns.

After 9/11, social network experts began to look explicitly at the use of network methodology in understanding and fighting terrorism. Important works include the analysis performed by (Krebs, 2001), who collected publicly available data on the Al-Qaeda hijackers and applied social network metrics to derive useful insights about the connections among the terrorists and their roles in the network (Muhammad Atta was identified as the key leader of the network). A similar analysis was performed by taking into account newspaper articles and radio commentary (Rothenberg, 2001). Additional works studied the potential use of SNA to destabilize terrorist networks (Memon, 2006). An interesting work in the field of criminal network analysis was recently proposed by (Ferrara, 2014). The authors present LogAnalysis, an expert system specifically designed to allow statistical network analysis, community detection, and visual exploration of mobile phone network data.

All the works analyzed propose systems that produce a graphical representation of a criminal network and provide structural analysis functionality to facilitate crime investigations. They share, however, a significant limitation: they rely on data obtained mainly outside the investigation process, with two effects: a lack of relevant information and, as a consequence, semantically weak graphs that take into account a single category of data only (e.g., phone calls). It is worth noting that the existence of social connections and relations among criminals can be inferred more effectively by also taking into account the other knowledge assets held by those conducting the investigation (e.g., police records). Applying the same network analysis metrics to a much more accurate and semantically richer map, produced by prosecutors and judicial police from different kinds of factual evidence (police records, the physical position of individuals, or even the contents of communications and interviews), opens up the prospect of a deeper understanding of the phenomenon under investigation.

3. CNA: some research issues

As shown in the brief survey in Section 2, the application of network analysis techniques can greatly improve the understanding of the structural and functional characteristics of criminal organizations. Even the analysis of graphs built on “semantically poor” data (e.g., the mere logs of calls between intercepted persons, without a transcript of their contents) makes it possible to extract important information about the role and importance of individuals within the criminal organization (leaders, hubs, etc.). If the results obtained from this kind of information are already interesting, much more can be obtained by semantically enriching the information from which the graphs are generated.


For example, community detection algorithms could lead to more realistic and accurate results by using not only basic information (e.g., the number and direction of calls between two parties) but also other types of relational information (e.g., the fact that the people involved in a communication share the same kind of criminal record). A significant problem in this context is the lack of structured information that would enable network analysis metrics to be applied effectively. The information needed for this purpose (e.g., criminal records, dialogue transcripts, etc.) is produced during the investigation by public prosecutors and police officers using tools (mainly traditional text editors) that neither offer scaffolding support nor allow a drafting workflow that produces the structured information needed for visualization and SNA. This implies the loss of a wealth of knowledge (often implicit relationships between data) that is potentially useful for the study of social relations within the criminal organization. A survey conducted in this vein, by interviewing Italian prosecutors and police officers, highlighted a set of critical issues ranging from typos to a mere lack of metadata. Some relevant examples include:

a) Inability to create and maintain structured, connected and reusable information
Standard text editors, normally used to draft criminal proceedings and reports, are designed to build “flat” documents lacking any kind of contextual or semantic markup. Furthermore, a large part of the data relevant to the identification and analysis of criminal relationships (e.g., the fact that two individuals under investigation share the same criminal record) is contained in different documents, collected in databases that are not connected to each other. A further complexity lies in the fact that the relational information is produced by different subjects (public prosecutors and police officers) normally involved in the same investigation.

b) Data incorrectness
Incorrect data regarding criminals’ identities or attributes may result either from unintentional data entry errors or from errors in repetition (e.g., the same person is referred to by slightly different names or by nicknames in different documents). The risk of wrong data entries grows as the number of individuals under investigation increases.

c) Lack of synchronous collaboration
Investigative proceedings containing relational information, such as remand documents, are created by different people collaborating on the same document asynchronously (e.g., exchanging files via email). Synchronous collaboration tools would instead allow better coordination, mutual support and knowledge sharing, with significant advantages for the investigation.

To address the above-mentioned issues in the scenario described so far, there is a strong need for a “holistic” approach combining different features: a) workflow management; b) document-enhancement; c) visualization; d) criminal network analysis; e) visual information retrieval techniques (Zhang 2008, Didimo et al. 2014). The CrimeMiner framework is the result of our first attempt to embed all these functionalities in a single tool. We describe each of them in the following Section.
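As an illustration of the data incorrectness issue in point b), the following minimal Python sketch flags name entries that probably refer to the same person. It is only an assumption-laden example, not a CrimeMiner feature: the names are invented, the functions `normalize` and `same_person_candidates` and the 0.85 threshold are hypothetical, and string similarity from the standard `difflib` module is used in place of whatever matching an investigative tool would actually apply.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case a name, drop dots and collapse repeated whitespace."""
    return " ".join(name.lower().replace(".", " ").split())


def same_person_candidates(names, threshold=0.85):
    """Return pairs of entries whose normalized forms are highly similar.

    The threshold is an arbitrary illustrative value; flagged pairs are
    candidates for human review, not confirmed identities.
    """
    cleaned = [normalize(n) for n in names]
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            ratio = SequenceMatcher(None, cleaned[i], cleaned[j]).ratio()
            if ratio >= threshold:
                pairs.append((names[i], names[j]))
    return pairs


# Invented example entries, as they might appear in different documents.
ENTRIES = ["Mario Rossi", "mario  rossi", "Mario Rosi", "Luigi Bianchi"]
CANDIDATES = same_person_candidates(ENTRIES)
```

A check of this kind cannot resolve nicknames, which share no spelling with the registered name; those would still require an explicit alias table maintained by the investigators.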


4. A Framework for Criminal Network Analysis

In this Section we describe CrimeMiner, the prototype implementing our holistic approach. After a brief analysis of its architecture, we focus on its main functionalities.

4.1. Architecture

From a technical point of view, the system architecture reflects the MVC (Model-View-Controller) design pattern. As shown in Figure 1, CrimeMiner is composed of three main components, each of which is described below.

Figure 1: CrimeMiner software architecture

● the Database component stores data managed by CrimeMiner, such as general settings, criminal records, phone and environmental tapping, and so on;
● the Modules component contains the business logic and interacts with the database layer;
● the User interface (GUI) component contains the editor (MainView) and the graph (ShowGraph) windows, as well as other interfaces the user can interact with. Its contents are extracted from the underlying layers.

Each layer relies only on the layers beneath it, which reduces redundant code and tight coupling between software components.

The current implementation of CrimeMiner works with a local installation of the MySQL 5.6 database (http://www.mysql.com/). MySQL is free and open source software, chosen for its widespread popularity. Thanks to the modularity and flexibility of CrimeMiner, developers can easily add support for different databases in the near future. The database is accessed via Hibernate (http://hibernate.org/), a free and open source library that abstracts data access from the database implementation and facilitates data management.

The Graph module contains the classes that deal with the graph: “Generation”, “Update”, “Filters”, “Metrics”, “Statistics” and so on. It exploits the Gephi library (http://gephi.github.io/), an “interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs” (Bastian, 2009).

The Utilities module contains useful, reusable functions and objects, such as a set of regular expression checkers, an improved auto-completing text field for the user interface, and so on.

The Editor module handles the editing features in CrimeMiner: users can edit text and choose formatting (font, colors, sizes, etc.). The module is based on the FreeMarker template engine to retrieve data from the database and insert it into the document.

The Config module contains the classes that manage the application configuration (e.g., user preferences, settings, database configuration).

User interaction is managed via two interfaces. The MainView is the entry point of CrimeMiner, where users can edit the document and update information about people, taps, criminal records and other elements that will be stored in the database. When the GraphPreview interface is accessed, CrimeMiner automatically builds a graph based on its internal database. Each individual in the database becomes a node of the graph; when two people are involved in the same tap, an edge is established between their corresponding nodes.

4.2. Functionalities

The main functionalities provided by CrimeMiner include Workflow management, Document-enhancement, Social Network Analysis, and Visualization facilities. We describe each of them in the following.

a) Workflow management
The system supports domain experts’ scaffolding and monitors all the activities needed to proceed from drafting to graph visualization and criminal network analysis. People working on investigative activities will benefit from this holistic approach, being guided through the whole process in a way that minimizes errors and maximizes effectiveness.

b) Document-enhancement
The Document-enhancement functionality aims at overcoming the limitations that often affect data entry in criminal network analysis settings: incompleteness, incorrectness, and inconsistency (Xu 2005a). To this end we provide an advanced editor (shown in Figure 2) that, in addition to the traditional word-processing facilities (i.e., drafting, formatting, etc.), allows the creation of a database containing all the information relevant to criminal network analysis (i.e., police records, police reports and so on). This user-friendly instrument makes it possible to prepare the document with all the metadata that will later be used for visualization and analysis. Our prototype works as a standard text editor, but also supports the insertion of people’s details and tappings (wiretapping and phone tappings).

Figure 2: Document editing with enhancement commands (on the right)

c) Visualization
CrimeMiner generates, on the fly, a snapshot of the criminal network under investigation from the data available in the database. As we will see, graphs are used both for structural and functional network analysis and for information retrieval. Figure 3 shows a graph generated using information about wiretappings from a real investigation (names are hidden for privacy reasons). The color of a node varies according to its degree, while its size is proportional to its closeness centrality. The more often two people are involved in the same conversations and phone calls, the thicker the edge between their nodes.
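The graph construction rule described in Section 4.1 (one node per individual, one edge per pair of people involved in the same tap) and the edge-thickness rule above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not CrimeMiner's code, which relies on the Gephi library: the person identifiers, the tap records and the function names `build_edges` and `node_degrees` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_edges(people, taps):
    """Count, for each pair of known people, the taps they share.

    `people` is the list of individuals in the database; `taps` is a list
    of participant lists, one per tapped conversation. The returned weight
    of a pair is the number of shared taps, which could drive edge thickness.
    """
    known = set(people)
    weights = Counter()
    for participants in taps:
        # Ignore duplicates and identifiers missing from the database.
        present = sorted(set(participants) & known)
        for a, b in combinations(present, 2):
            weights[(a, b)] += 1
    return weights


def node_degrees(people, weights):
    """Number of distinct contacts per person (isolated people get 0)."""
    degree = {person: 0 for person in people}
    for a, b in weights:
        degree[a] += 1
        degree[b] += 1
    return degree


# Invented example: four individuals and three tapped conversations.
PEOPLE = ["P1", "P2", "P3", "P4"]
TAPS = [["P1", "P2"], ["P2", "P1"], ["P2", "P3"]]
EDGES = build_edges(PEOPLE, TAPS)
DEGREES = node_degrees(PEOPLE, EDGES)
```

Sorting the participants before pairing them gives every undirected edge a single canonical key, so repeated conversations between the same two people accumulate on one edge regardless of who called whom.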


Figure 3: Visualization and analysis of a criminal network

d) Criminal Network Analysis
The tool allows the application of the main Social Network Analysis metrics and methods in order to ease the study of the characteristics of the criminal organization and to identify the role of single individuals within it. The current implementation supports the following metrics/methods:
● Degree centrality (Brandes 2001)
● Betweenness centrality (Brandes 2001)
● Closeness centrality (Brandes 2001)
● Clustering coefficient
● Community detection (Blondel 2008)
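To make three of the listed metrics concrete, the sketch below computes degree centrality, closeness centrality and the local clustering coefficient on a small undirected graph. It is a plain-Python illustration under simple textbook definitions; the normalisations used by Gephi (and hence by CrimeMiner) may differ, and the graph `adj` is hypothetical.

```python
from collections import deque


def degree_centrality(adj):
    """Degree divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    if n < 2:
        return {v: 0.0 for v in adj}
    return {v: len(neighbours) / (n - 1) for v, neighbours in adj.items()}


def closeness_centrality(adj):
    """Reachable nodes divided by the sum of shortest-path distances (BFS)."""
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def clustering_coefficient(adj):
    """Fraction of pairs of neighbours that are themselves connected."""
    scores = {}
    for v, neighbours in adj.items():
        k = len(neighbours)
        if k < 2:
            scores[v] = 0.0
            continue
        links = sum(1 for a in neighbours for b in neighbours
                    if a < b and b in adj[a])
        scores[v] = 2 * links / (k * (k - 1))
    return scores


# Hypothetical undirected graph given as adjacency sets.
adj = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
clustering = clustering_coefficient(adj)
```

Betweenness centrality and community detection are left out because they need longer algorithms (Brandes 2001; Blondel 2008) than fit in a short sketch.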

It is also possible to apply filters to the graph. Very simple implementations are currently embedded into the tool: for example it is possible to set a threshold on the nodes’ degree or visualize only people living in a certain geographical area. In the near future we aim to expand our set of filters in order to consider other metadata, such as accusations and time of the tapping.
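The two filters mentioned above, a degree threshold and a selection by geographical area, could look like the following sketch. The function names, the `area` attribute and the sample data are assumptions made for illustration; they are not CrimeMiner's actual interface.

```python
def filter_by_degree(adj, min_degree):
    """Keep nodes whose degree in the original graph is at least min_degree."""
    keep = {v for v, neighbours in adj.items() if len(neighbours) >= min_degree}
    return {v: adj[v] & keep for v in keep}


def filter_by_attribute(adj, attributes, key, value):
    """Keep nodes whose metadata field `key` equals `value`."""
    keep = {v for v in adj if attributes.get(v, {}).get(key) == value}
    return {v: adj[v] & keep for v in keep}


# Hypothetical graph and metadata.
adj = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
attributes = {"A": {"area": "North"}, "B": {"area": "South"},
              "C": {"area": "South"}, "D": {"area": "North"}}

core = filter_by_degree(adj, 2)
north = filter_by_attribute(adj, attributes, "area", "North")
```

Both functions return a new adjacency structure restricted to the surviving nodes, so further metadata filters (for instance on accusations or tap times) could be chained in the same way.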


e) Visual Information Retrieval
This functionality, currently under development, will allow visual browsing of the CrimeMiner database, offering the possibility to access the data on both the criminal network and its members simply by interacting with the corresponding graph.

5. Conclusions and future works
The work presented in this paper is the first step towards the integration of network analysis techniques into investigative activities. The functionalities implemented so far were discussed with prosecutors personally involved in the fight against organized crime and seem to represent a good starting point for exploiting network analysis techniques in the investigative process.

Ongoing and future work will cover different aspects. The first one concerns enhancements to document drafting. We are planning to embed templates for each of the different documents created by domain experts during the investigations. Additional effort will be devoted to improving the User Interface. Another relevant goal is to enhance the analysis supported by CrimeMiner. On the one hand, we will take into account metadata that are available but not yet used (e.g. residence and activity area, criminal records, family ties among individuals). On the other hand, we will explore and compare the effectiveness of different algorithms in supporting crucial parts of the analysis, such as community detection. The implementation of more advanced visual information retrieval functionalities is another important aspect for people involved in the investigations, while support for online synchronous collaboration will be extremely useful for public prosecutors and police officers.

Finally, we plan to perform an in-depth usability study with domain experts chosen to evaluate the system. Our main goal is to test both the ease of use and the usefulness of the proposed system. We also want to analyze the users’ acceptance and, therefore, their behavioral intention to use the developed system.

References
Bastian, M., Heymann, S., and Jacomy, M. (2009). “Gephi: an open source software for exploring and manipulating networks”. ICWSM, 8, 361-362.
Blondel, V. D., Guillaume, J., Lambiotte, R., and Lefebvre, E. (2008). “Fast unfolding of communities in large networks”. Journal of Statistical Mechanics: Theory and Experiment, Vol. 2008, No. 10.
Brandes, U. (2001). “A Faster Algorithm for Betweenness Centrality”. Journal of Mathematical Sociology 25(2):163-177.
Didimo, W., Liotta, G., and Montecchiani, F. (2014). “Network visualization for financial crime detection”. Journal of Visual Languages & Computing, 25(4):433-451.
Ferrara, E., De Meo, P., Catanese, S., and Fiumara, G. (2014). “Detecting criminal organizations in mobile phone networks”. Expert Systems with Applications 41(13):5733-5750.
Hutchins, C. E., and Benham-Hutchins, M. (2010). “Hiding in plain sight: criminal network analysis”. Computational and Mathematical Organization Theory, 16(1), 89-111.
Krebs, V. (2001). “Mapping Networks of Terrorist Cells”. Connections, 24(3), 43-52.
Mastrobuoni, G., and Patacchini, E. (2012). “Organized Crime Networks: an Application of Network Analysis Techniques to the American Mafia”. Review of Network Economics: Vol. 11: Iss. 3, n. 10.
Lettieri, N., Malandrino, D., Rinaldi, C., and Spinelli, R. (2014). “From Structure to Function. Exploring the Use of Text and Social Network Analysis in Criminal Investigations”. In Winkels, R., Lettieri, N., Faro, S., Network Analysis in Law, Naples, ESI, 79-94.
Morselli, C. (2009). “Inside Criminal Networks”. Studies of Organized Crime, Vol. 8.
Memon, N., and Larsen, H. (2006). “Practical algorithms for destabilizing terrorist networks”. In Proceedings of the 4th IEEE international conference on Intelligence and Security Informatics (ISI'06), 389-400.
Rothenberg, R. (2001). “From whole cloth: making up the terrorist network”. Connections, 24(3): 36-42.
Stewart, T. A. (2001). “Six Degrees of Mohamed Atta”. Business 2.0.
Sutherland, E. H. (1947). “Principles of Criminology”. Chicago: J.B. Lippincott.
Xu, J., and Chen, H. (2005a). “Criminal Network Analysis and Visualization”. Communications of the ACM, 48(6), 100-107.
Xu, J., and Chen, H. (2005b). “CrimeNet explorer: a framework for criminal network knowledge discovery”. ACM Trans. Inf. Syst. 23, 2 (April 2005), 201-226.
Zhang, J. (2008). “Visualization for Information Retrieval”. Berlin: Springer, p. 16.


The Case Law of the Italian Constitutional Court between Network Theory and Philosophy of Information
Tommaso Agnoloni1, Ugo Pagallo2
1 Institute of Legal Information Theory and Techniques ITTIG - CNR, Via de’ Barucci 20, Firenze, Italy [email protected]
2 Torino Law School, University of Torino, Lungo Dora Siena 100, 10153 Torino, Italy [email protected]

1. Introduction
Over the past decade, work on network analysis and the law has become increasingly popular among scholars. This research covers, among other things, jurisprudence, legislation, and the ways in which academic studies cite each other. Here, the focus is on the case law of the Italian Constitutional Court and its own citation network. Section 2 explains how we built this network and illustrates some of its most relevant properties. Section 3 deepens these results in terms of information. By distinguishing three levels of analysis, i.e. legal information “for” reality, “as” reality, and “about” reality, special attention is drawn to matters of knowledge and concepts that inform us about the different states of the world and frame the representation and function of the legal system under scrutiny. The conclusions indicate the next steps of our research.

2. Construction of the constitutional case law citation network
In 2013 the Italian Constitutional Court released as open data the complete dataset of the rulings it has delivered since its origin in 1956. In accordance with open data principles, the data are released in an open format (XML) and under an open license (CC-BY-SA 3.0). Documents are annotated with a rich set of machine-readable metadata of different kinds (bibliographic, temporal, semantic, legal). Furthermore, the availability of the entire document corpus allows processing tools to be applied in bulk to the plain texts for additional information extraction and dataset enrichment. The construction of the citation network of constitutional decisions requires the proper identification of the rulings (the nodes) and the explicit annotation of the judicial references among them (the edges), which are not available in the released dataset.

The identification of legal sources in general, and judicial documents in particular, is a basic issue in the overall eJustice framework, whose aim is to improve the efficiency of justice administration through the construction of interoperable legal information systems. At the European level, the European Case Law Identifier (“ECLI”) initiative has tackled the problem through the definition of the ECLI identification standard. The aim is to unequivocally identify the decisions of any court in every EU member state through the adherence of national jurisdictions to a common identification standard.

The identifier is composed of five fields in the following order: the “ECLI” abbreviation, the country code (IT for Italy), the court code, the year of the decision, and the unique ordinal number of the decision, all separated by a colon (“:”). In order to compose the ECLI of the decisions of the Italian Constitutional Court, their metadata have been serialized, so that each decision has its proper standard identifier. Given the authority code for the Constitutional Court (COST) and the codes describing the type of decision, used as a prefix to the decision number (S for judgement, O for order, D for decree), the ECLI is composed as follows:

ECLI:IT:COST:{year}:{decision type}{number}

On this basis, the ECLI provides the id of the nodes in the network, whilst edges are established by judicial citations. These are hidden in the documents as textual references: their explicit annotation requires extraction through linguistic processing tools applied to the texts. Following the introduction of ECLI, Prudence, a reference parser for Italian case law, was developed by Bacci et al. (2013). Its aim is to extract judicial citations from plain text and serialize them in the ECLI standard format. Originally developed for the extraction of judicial citations in the civil case law of first instance, the parser has been adapted to the texts of constitutional judgements, so as to cover more of the lexical citation forms typically used in constitutional case law. In this preliminary investigation we were interested in the extraction of references to other constitutional judgements and discarded references to judgements of other (lower ranked) courts. The evaluation of the parser on a sample of 608 manually annotated citation contexts from a set of 60 cases evenly distributed over time resulted in a precision of 98.4% and a recall of 91.7% (see Table 1).
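The identifier scheme given above for the Constitutional Court can be sketched in a few lines. The helper names `compose_ecli` and `parse_ecli` are hypothetical, and the pattern only covers the COST form described in the text, not the full ECLI standard.

```python
import re

# Decision-type prefixes as listed in the text.
DECISION_TYPES = {"S": "judgement", "O": "order", "D": "decree"}
ECLI_PATTERN = re.compile(r"^ECLI:IT:COST:(\d{4}):([SOD])(\d+)$")


def compose_ecli(year, decision_type, number):
    """Build the identifier ECLI:IT:COST:{year}:{decision type}{number}."""
    if decision_type not in DECISION_TYPES:
        raise ValueError(f"unknown decision type: {decision_type!r}")
    return f"ECLI:IT:COST:{year}:{decision_type}{number}"


def parse_ecli(ecli):
    """Split a Constitutional Court ECLI into (year, decision type, number)."""
    match = ECLI_PATTERN.match(ecli)
    if match is None:
        raise ValueError(f"not a Constitutional Court ECLI: {ecli!r}")
    year, decision_type, number = match.groups()
    return int(year), decision_type, int(number)


identifier = compose_ecli(2007, "S", 349)
fields = parse_ecli("ECLI:IT:COST:1956:S1")
```

Keeping composition and parsing symmetric means an identifier produced by the first function can always be read back by the second.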
Measure                            Value
#documents                         60
#citation contexts                 608
#manually annotated references     1294
#correctly extracted references    1170
#wrong extracted references        18
#not extracted references          106
Accuracy                           90.4%
Precision                          98.4%
Recall                             91.7%
F1                                 94.9%

Table 1. Evaluation of the judicial citations parser on the extraction of constitutional references.
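As a worked check, the scores in Table 1 can be recomputed from the raw counts using the standard definitions of precision, recall and F1. Reading the accuracy figure as correct references over all manually annotated references is an assumption, since the paper does not define it.

```python
correct = 1170  # correctly extracted references
wrong = 18      # wrongly extracted references
missed = 106    # references not extracted

precision = correct / (correct + wrong)
recall = correct / (correct + missed)
f1 = 2 * precision * recall / (precision + recall)
# Assumption: accuracy = correct / manually annotated (1170 + 18 + 106 = 1294).
accuracy = correct / (correct + wrong + missed)

print(f"precision={precision:.4f} recall={recall:.4f} "
      f"f1={f1:.4f} accuracy={accuracy:.4f}")
```

The recomputed values agree with the reported percentages to within about a tenth of a percentage point; the small differences are compatible with truncation rather than rounding in the table.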

Applying the reference parser to the textual corpus makes it possible to obtain a processable citation graph automatically.

Nodes in the graph are labelled with their date and typology but can potentially be annotated with all the metadata associated with the documents in the dataset (subjects, judges, related norms, ...). In this first part of the analysis we considered the subset of the graph corresponding to a particular class of verdicts, i.e. in via incidentale, which represents nearly 80% of the whole constitutional corpus. These are decisions on questions of the constitutional legitimacy of norms raised by judges in a trial. The other main typology of constitutional legitimacy claims, in via principale, comprises those promoted by state institutions (government, parliament, regional governments) before the Court. The distribution of the typology of judgements is illustrated by Table 2:

Type of judgement      Percentage
in via incidentale     78.64%
in via principale      11.67%
other                  9.69%
Total number of judgements: 19312

Table 2. Distribution of judgement typology in the decisions dataset.

On the selected type of judgements, that is in via incidentale, we started with the analysis of the network of internal citations, namely references of the Court to its own precedents, leaving aside the citations of other courts. The type of link (internal or external) is easily distinguished by filtering on the issuing authority in the ECLI (the third field, i.e. the court code). The application of the parser to the selected corpus of in via incidentale judgements resulted in the distribution reported in Table 3.

Type of reference      Percentage
Constitutional         58.7%
to other courts        41.3%
Total number of references: 98081

Table 3. Distribution of extracted judicial citations.
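Distinguishing internal from external references by the third ECLI field, as described above, can be sketched as follows. The sample identifiers are made up for illustration, and the non-COST court code in particular is only a placeholder.

```python
from collections import Counter


def court_code(ecli):
    """Return the third ECLI field, i.e. the code of the issuing court."""
    fields = ecli.split(":")
    if len(fields) != 5 or fields[0] != "ECLI":
        raise ValueError(f"not a well-formed ECLI: {ecli!r}")
    return fields[2]


def classify_references(eclis, own_court="COST"):
    """Count references to the Court's own precedents versus other courts."""
    return Counter(
        "internal" if court_code(e) == own_court else "external" for e in eclis
    )


# Hypothetical extracted references.
references = [
    "ECLI:IT:COST:2007:S349",
    "ECLI:IT:COST:1984:S170",
    "ECLI:IT:XXXX:2010:123",  # placeholder code for some other court
]
counts = classify_references(references)
```

Dividing each count by the total number of references gives percentages of the kind reported in Table 3.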

From the processed data we were able to construct and analyse the Constitutional Court citation network using Gephi and the Gephi-toolkit in a Java environment.

The overall citation network consists of 57584 citations distributed among 16320 nodes (15186 of which are incidentali). The basic topological properties of the resulting graph were computed with standard graph analysis algorithms and are reported in Table 4 below.

Metric                 Value
Number of nodes        16320
Number of edges        41858
In-degree              [0-74]
Out-degree             [0-109]
Average degree         2.56
Network diameter       23
Average path length    6.89

Table 4. Network properties.
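Some of the properties in Table 4 follow directly from an edge list. The sketch below computes node and edge counts, degree ranges and the average degree for a directed graph; the reported average degree is consistent with edges divided by nodes (41858 / 16320 ≈ 2.56). The function name and the toy data are hypothetical, and treating a repeated citation between the same pair as a single edge is an assumption.

```python
from collections import Counter


def degree_summary(nodes, edges):
    """Basic properties of a directed graph given as (citing, cited) pairs."""
    edges = set(edges)  # assumption: repeated citations count as one edge
    in_degree = Counter(cited for _, cited in edges)
    out_degree = Counter(citing for citing, _ in edges)
    return {
        "nodes": len(nodes),
        "edges": len(edges),
        "in_degree_range": (min(in_degree[v] for v in nodes),
                            max(in_degree[v] for v in nodes)),
        "out_degree_range": (min(out_degree[v] for v in nodes),
                             max(out_degree[v] for v in nodes)),
        "average_degree": len(edges) / len(nodes),
    }


# Toy example: "c" cites "a" twice and "b" once, "b" cites "a", "d" is isolated.
summary = degree_summary(
    {"a", "b", "c", "d"},
    [("c", "a"), ("c", "b"), ("b", "a"), ("c", "a")],
)
reported_average = 41858 / 16320  # edges / nodes from Table 4
```

Diameter and average path length need shortest-path computations over the whole graph and are not covered here.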

We then analysed the topology of the network in terms of the degree distribution of its nodes. The first question we addressed is whether the distribution of the citation “edges” among the source nodes in the network follows a power law distribution. This is to say that a small number of cases receive a large number of citations, whereas most of the cases have few citations or none at all.
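One common way to estimate the exponent of a power law from degree data is the maximum-likelihood (Hill) estimator, sketched below. The paper does not state which fitting procedure was used, so this is an assumption for illustration only; the continuous formula is also only an approximation for integer degrees, and the sample values are made up.

```python
import math


def estimate_alpha(degrees, x_min=1):
    """Continuous maximum-likelihood estimate of a power-law exponent.

    alpha = 1 + n / sum(ln(x_i / x_min)) over the n values with x_i >= x_min.
    """
    tail = [d for d in degrees if d >= x_min]
    log_sum = sum(math.log(d / x_min) for d in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1 + len(tail) / log_sum


# Hypothetical in-degree sample: many small values, a few large ones.
sample = [1, 1, 1, 1, 2, 2, 4, 8]
alpha = estimate_alpha(sample)
print(round(alpha, 3))
```

The choice of `x_min` matters in practice, because real degree distributions usually follow a power law only in their tail.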

Figure 1. Fitting of power law function with degree distribution.

Fitting a power law function to the degree distribution of the network (as reported in Fig. 1 on a log-log scale, with the best fit estimated at α = 2.391) suggests that the citation graph exhibits the properties of a scale-free network. This fact, though not surprising, supports a number of hypotheses that are thoroughly discussed in Section 3.

The second step of the analysis was the evaluation of basic measures concerning the importance of the nodes. We discarded the nodes corresponding to the ordinances of the Court, so as to restrict the focus of the analysis to the network made of the in via incidentale rulings. In a directed graph the in-degree centrality measure counts the number of incoming edges (citations) for each node (judgement) of the network. This sheds light on the importance of a case. However, this measure does not fully exploit the information available in the network, since it treats all the inward citations in exactly the same way. As proposed by Fowler and Jeon (2008) following Kleinberg (1998), we may distinguish between hubs (i.e. cases that cite many other decisions) and authorities (i.e. cases that are widely cited by other decisions). Accordingly, we can refine the in-degree centrality measure by taking into account how hubs and authorities mutually influence their scores, i.e. a node is a good hub if it cites many good authorities and a node is a good authority if it is cited by many good hubs.

In Table 5 the top 30 cases of the whole corpus are ranked according to their in-degree. The same ranking holds with respect to the authority score. Out-degree (count of outward citations) and hub measures are also reported in the table.
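The mutual reinforcement between hubs and authorities described above is usually computed by power iteration (Kleinberg 1998). The following is a minimal plain-Python sketch; the analysis in the paper was carried out with Gephi, whose normalisation may differ, and the function name and toy citation pairs are hypothetical.

```python
def hits(edges, iterations=50):
    """Hub and authority scores for a directed graph of (citing, cited) pairs."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # A good authority is cited by good hubs.
        auth = {n: 0.0 for n in nodes}
        for citing, cited in edges:
            auth[cited] += hub[citing]
        # A good hub cites good authorities.
        hub = {n: 0.0 for n in nodes}
        for citing, cited in edges:
            hub[citing] += auth[cited]
        # Normalise so that each set of scores sums to 1.
        auth_total = sum(auth.values()) or 1.0
        hub_total = sum(hub.values()) or 1.0
        auth = {n: s / auth_total for n, s in auth.items()}
        hub = {n: s / hub_total for n, s in hub.items()}
    return hub, auth


# Toy network: three decisions citing two precedents.
edges = [("d1", "p1"), ("d1", "p2"), ("d2", "p1"), ("d3", "p1")]
hub, auth = hits(edges)
```

A node without outgoing citations necessarily gets a hub score of 0, which matches the rows of Table 5 whose out-degree is 0.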

Rank  Identifier               In-degree  Out-degree  Authority       Hub
1     ECLI:IT:COST:2007:S349   43         43          0.0016747231    0.001895789
2     ECLI:IT:COST:2007:S348   37         48          0.0014463518    0.0017180226
3     ECLI:IT:COST:1963:S129   36         2           0.0014082899    0.0015517623
4     ECLI:IT:COST:1956:S1     36         0           0.0014082899    0
5     ECLI:IT:COST:1970:S190   36         7           0.0014082899    0.0017180226
6     ECLI:IT:COST:1988:S971   33         1           0.0012941043    0.0014963422
7     ECLI:IT:COST:1980:S5     31         1           0.0012179804    0.0015517623
8     ECLI:IT:COST:1994:S6     31         11          0.0012179804    0.0014963422
9     ECLI:IT:COST:1957:S118   30         5           0.0011799185    0.0016626025
10    ECLI:IT:COST:1988:S822   29         3           0.0011418567    0.0015517623
11    ECLI:IT:COST:1990:S313   27         10          0.0010657329    0.0013300821
12    ECLI:IT:COST:1968:S86    27         2           0.0010657329    0.0013300821
13    ECLI:IT:COST:1994:S397   26         28          0.001027671     0.0013300821
14    ECLI:IT:COST:1962:S106   26         0           0.001027671     0
15    ECLI:IT:COST:1961:S21    26         0           0.001027671     0
16    ECLI:IT:COST:1962:S87    26         1           0.001027671     0.001385502
17    ECLI:IT:COST:1967:S22    25         2           0.000989609     0.001385502
18    ECLI:IT:COST:1957:S3     25         0           0.000989609     0
19    ECLI:IT:COST:1957:S46    25         0           0.000989609     0
20    ECLI:IT:COST:1968:S55    25         7           0.000989609     0.001274662
21    ECLI:IT:COST:1968:S75    25         2           0.000989609     0.0011638218
22    ECLI:IT:COST:1962:S88    25         1           0.000989609     0.0012192419
23    ECLI:IT:COST:1983:S148   25         12          0.000989609     0.0011084017
24    ECLI:IT:COST:2000:S525   24         5           0.00095154723   0.001274662
25    ECLI:IT:COST:1996:S89    24         1           0.00095154723   0.00094214146
26    ECLI:IT:COST:1957:S59    24         0           0.00095154723   0
27    ECLI:IT:COST:1966:S26    24         23          0.00095154723   0.001274662
28    ECLI:IT:COST:1993:S243   23         21          0.0009134853    0.0011084017
29    ECLI:IT:COST:1984:S170   23         5           0.0009134853    0.0011638218
30    ECLI:IT:COST:1993:S306   23         13          0.0009134853    0.001274662

Table 5. Top in-degree – in via incidentale judgements

Table 5 offers, so to speak, the synchronic picture of our case law, that is, the network as a whole for the entire history of the Italian Constitutional Court. Yet, a more fine-grained analysis can be performed by restricting the analysis to a specific time period so as to follow the evolution of the network. Experts of Italian constitutional law usually distinguish three phases:

i) From 1956 to 1970: the Court was mostly focused on the legitimacy of the laws passed before the 1948 Constitution;
ii) From 1970 to 1989: most of the work was on the enforcement of the new constitutional principles and rules;
iii) From 1989 to the present: the era of the amendments to the 1948 Constitution and their legitimacy.

For the sake of conciseness, suffice it to dwell here on the period 1989-2014 (recent rulings). The network is composed not only of the decisions the Court made in this time interval, but also of older cases outside this interval if they are cited by recent decisions. Citations among them are thus restricted to those that occurred in the observed time interval. Table 6 reports node measures and ranking for the network of recent rulings.

Rank  Identifier               In-degree  Out-degree  Authority       Hub
1     ECLI:IT:COST:2007:S349   43         43          0.0024485253    0.004511457
2     ECLI:IT:COST:2007:S348   37         48          0.0021146354    0.003680399
3     ECLI:IT:COST:1988:S971   33         0           0.0018920423    0
4     ECLI:IT:COST:1994:S6     31         11          0.0017807457    0.0032055087
5     ECLI:IT:COST:1988:S822   29         0           0.001669449     0
6     ECLI:IT:COST:1990:S313   27         10          0.0015581525    0.0028493411
7     ECLI:IT:COST:1994:S397   26         28          0.0015025042    0.0028493411
8     ECLI:IT:COST:1996:S89    24         1           0.0013912076    0.0020182834
9     ECLI:IT:COST:2000:S525   24         5           0.0013912076    0.0027306185
10    ECLI:IT:COST:1993:S243   23         21          0.0013355593    0.002374451
11    ECLI:IT:COST:1993:S283   23         25          0.0013355593    0.0028493411
12    ECLI:IT:COST:1993:S306   23         13          0.0013355593    0.0027306185
13    ECLI:IT:COST:2009:S94    23         38          0.0013355593    0.0018995607
14    ECLI:IT:COST:1990:S155   22         4           0.0012799109    0.0024931733
15    ECLI:IT:COST:1991:S119   21         11          0.0012242626    0.0022557285
16    ECLI:IT:COST:2009:S236   21         14          0.0012242626    0.0017808381
17    ECLI:IT:COST:1983:S148   21         0           0.0012242626    0
18    ECLI:IT:COST:2009:S311   20         8           0.0011686144    0.0022557285
19    ECLI:IT:COST:1980:S5     20         0           0.0011686144    0
20    ECLI:IT:COST:1995:S432   19         5           0.0011129661    0.0020182834
21    ECLI:IT:COST:2007:S234   19         14          0.0011129661    0.0018995607
22    ECLI:IT:COST:1996:S417   19         10          0.0011129661    0.0020182834
23    ECLI:IT:COST:1994:S341   19         7           0.0011129661    0.0015433931
24    ECLI:IT:COST:1995:S421   19         13          0.0011129661    0.0021370058
25    ECLI:IT:COST:1993:S39    19         7           0.0011129661    0.0020182834
26    ECLI:IT:COST:1988:S364   19         0           0.0011129661    0
27    ECLI:IT:COST:1996:S131   18         11          0.0010573177    0.0021370058
28    ECLI:IT:COST:1993:S197   18         6           0.0010573177    0.0018995607
29    ECLI:IT:COST:1999:S229   18         11          0.0010573177    0.0021370058
30    ECLI:IT:COST:1995:S313   18         4           0.0010573177    0.0017808381

Table 6. Top in-degree– in via incidentale Judgements; Restricted interval 1989-2014.
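The restriction to a time interval described before Table 6 amounts to keeping only the citations issued by decisions within the interval, while retaining older decisions that those citations point to. A minimal sketch, assuming the year can be read from the fourth ECLI field; the function names and the citation pairs are invented for illustration and are not taken from the dataset.

```python
def decision_year(ecli):
    """Read the year from the fourth field of an ECLI."""
    return int(ecli.split(":")[3])


def restrict_to_interval(edges, first_year, last_year):
    """Keep citations made by decisions issued within [first_year, last_year].

    Older decisions stay in the network only as targets of those citations.
    """
    kept = [(citing, cited) for citing, cited in edges
            if first_year <= decision_year(citing) <= last_year]
    nodes = {n for edge in kept for n in edge}
    return nodes, kept


# Invented citation pairs (citing, cited).
edges = [
    ("ECLI:IT:COST:2005:S9001", "ECLI:IT:COST:1985:S9002"),
    ("ECLI:IT:COST:1985:S9002", "ECLI:IT:COST:1960:S9003"),
    ("ECLI:IT:COST:1995:S9004", "ECLI:IT:COST:1991:S9005"),
]
nodes, recent_edges = restrict_to_interval(edges, 1989, 2014)
```

In this toy example the 1985 decision stays in the network because a recent decision cites it, but its own outgoing citation is dropped. This mirrors the pre-1989 cases in Table 6, which keep their in-degree while their out-degree is 0.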

In light of the results of Table 6, and by comparing them with the overall picture of Table 5, the time is ripe for examining the content of this networked case law.

3. Legal networks and their information
In order to discuss the results presented in the previous section, we first need to set the proper level of abstraction. As occurs with all the network approaches to the law, ours presents the legal phenomenon in terms of information. However, the connection between law and information can be viewed from three different perspectives: legal information “for” reality, “as” reality, and “about” reality (Floridi 2009; Pagallo 2006).

First, legal information “for” reality is the most common, and arguably the most important, type of information in the legal domain, since information is here conceived as a set of rules or instructions for the determination of other informational objects. This is the common ground for every kind of legal positivism, normativism, or imperativism whatsoever. Also, legal information “for” reality is relevant to some aspects of the natural law tradition, of constitutionalism, of institutionalism, and so forth. Yet, this type of instructional information with a signalling function is most of the time not directly pertinent in our field: although we deal with the rulings of the Italian Constitutional Court as information “for” reality, the findings of network analysis in the legal arena do not aim to immediately determine how individuals should behave in a given territory. Therefore, we can skip this level of abstraction in the present context.

Second, legal information “as” reality refers to data which can be meaningful independently of an intelligent “producer/informer” (Floridi 2009: 32). This perspective has become popular in the field of evolutionary psychology, according to which basic human concepts would be wired in our brains, thus grounding cultural evolution. In addition, this stance specifies what all the variants of the natural law tradition have in common, namely the examination of being qua being, or its essence, so that law could be found in the nature of the individuals, of the world, of the things, i.e. the German “Natur der Sache.” Likewise, legal information “as” reality is at stake with the results of network analysis in such cases as, say, the power law distribution of information in the case law of the US Supreme Court (Fowler and Jeon 2008), of the EU Court of Justice (Malmgren 2011), or of the Italian Constitutional Court. When we started analysing the case law of the latter, we were fairly confident that the result would be similar to previous work on the jurisprudence of the US Supreme Court, of the EU Court of Justice, etc. Hence, the most typical and genuine of the philosophical questions: Why? A possible explanation is simple and straightforward. Such features of complex networks as their power laws and the presence of hubs optimize the flow of information in the system. Interestingly, this was the suggestion of Herbert Simon in his seminal The Sciences of the Artificial (1969, ed. 1996). Simon’s notion of “nearly decomposable systems” proposes hierarchy as the clue for grasping the architecture of complexity, in that “most things are only weakly connected with most other things; for a tolerable description of reality only a tiny fraction of all possible interactions needs to be taken into account” (op. cit.).
Additionally, Simon’s “empty world hypothesis” can be properly grasped with the notion of hubs, for this small fraction of nodes in the network with a much higher degree of connectivity than the average explains the clusters of dense interaction in the chart of informational exchange, by offering the common connections mediating the short path lengths between the nodes of the network. This does not mean, of course, that all legal systems should present small world features, power laws of distribution, etc. In Boulet et al. (2010), for instance, it is argued that the French environmental code presents a small world structure and yet, it contrasts with the reference network of all the French legal codes. Although the latter shows “a rich club of ten codes very central to the whole… system,” this reference network has no small world properties at all. Eventually, we should stress the difference between human political planning and the unintentional emergence of spontaneous orders, that is, what Friedrich Hayek used to dub the distinction between taxis and kosmos. Still, what matters here should not be the subject-matter of an ideological debate on law “as” information. Rather, the issue concerns true semantic content as a necessary condition for knowledge, namely our third and final level of abstraction.

Legal information “about” reality has in fact to do with matters of knowledge and concepts that frame the representation and function of a given system, and inform us about the different states of reality. This is the bread and butter of the sociological approaches to the legal phenomenon, and of some variants of legal realism as well. For example, consider American (as opposed to Scandinavian) legal realism and the idea that the law is a sort of prophecy of what the courts will do in fact. Legal information “about” reality is particularly relevant for the work of network theory, since legal information is here alethically qualifiable as semantic information with factual content about the states of the world. What is then the semantic content, alethically qualifiable, of our research?

All in all, work on legal information “about” reality aims to shed light on the peculiarities of the system under scrutiny, notwithstanding the “family resemblances” with other case law networks. As a matter of legal information “as” reality, we already mentioned the similarities of such networks as the case law of the US Supreme Court, of the EU Court of Justice, and the topic of this paper: the case law of the Italian Constitutional Court. As a matter of legal information “about” reality, the differences between such networks now have to be fleshed out. In their 2005 work on the US Supreme Court jurisprudence, for example, Fowler and Jeon showed the semantic evolution of this network: whereas the most authoritative cases before the American civil war involved freedom of contract, namely the contract clause, after that war and until the end of the 1930s with the New Deal, the main core became balance of power in order to regulate commercial issues in a federal system. Then, since the late 1950s, the Supreme Court shifted its focus towards civil liberties and, especially, freedom of speech. In the case of such a legal network as the jurisprudence of the Italian Constitutional Court, things are necessarily different. Over almost 60 years of evolution, the most authoritative cases insist on two specific fields: 5 of the first 15 hubs in the network regard labour law, 3 criminal law (which become 6 if we take into account the first 25 hubs of the network). No Italian would be surprised by such results!
Here, let us illustrate the outcomes of our analysis with two examples on “paradigm shifts” and “legal transplants.” In the case of legal information “as” reality, the discussion is naturally universal and all-embracing. Dealing with legal information “about” reality, the focus is on certain specific structures, concepts, or trends of the legal phenomenon.

3.1 Paradigm shifts
It is noteworthy that, at the top of the overall top cases of our network, we find the so-called “twin rulings,” namely case 349 from 2007 (# 1) and case 348 from 2007 (# 2). In these cases, the Italian court had to define how the European Convention on Human Rights (“ECHR”), as well as the jurisprudence of the Court in Strasbourg, interact with the Italian legal system. The alternative was (and still is) between a standard international “dualistic” approach and a “monistic” interpretation of such legal interaction. In the first case, dualism means that both norms and rulings of international law have either to be implemented by the national legislator, or to be filtered out by the national supreme court. In the second case, monism means that both legal systems, national and international, are grasped as a single unified network, so that rulings and norms at the international level should directly be applied to the national legal system. The latter is the typical approach of the EU Court of Justice (“CoJ”) that some national constitutional courts, such as the German Bundesverfassungsgericht and the Italian Consulta, have resisted over the past decades. For example, the Italian Constitutional Court opposed its standard dualistic approach to the rulings of the EU CoJ in Costa vs Enel (1964), Frontini (1973), Industrie Chimiche (1977), and so forth. The Italian court finally changed its mind in Granital (case 170 from 1984). Although the justices in Rome formally maintained their dualistic approach, they substantially accepted the monistic conclusions of their colleagues in Luxembourg, such as the “supremacy of the EU law,” the direct applicability of the communitarian norms, etc. As a result, a paradigm shift occurred in Italy and this is why Granital is still ranked # 29 amongst the overall cases in the network.

However, how about the interaction between the ECHR and the Italian legal system? Should it be grasped in accordance with the standard international dualistic approach, pursuant to Article 10 of the Italian Constitution, or in a similar way to the supremacy and direct applicability of EU law, at least after Granital, pursuant to Article 11 of the Basic Law? In a nutshell, the Court rejected a Granital-like approach, thus maintaining the standard dualistic interpretation of the ECHR legal system. There are many reasons why this conclusion is debatable and, unsurprisingly, this is why the Court had to return time and again to the arguments of the “twin rulings” over the past 7 years. What turns a ruling into a hub of the network is not its theoretical relevance. Rather, a hub shows the reasons why a certain field of the system is under pressure.

3.2 Legal transplants
On 22 September 1988, Italy adopted a new code of criminal procedure. It was a case of legal transplant, according to the formula of Alan Watson (1974), since the aim was to substitute the previous inquisitorial system with an adversarial system, typical of the common law tradition. This is not the only case in which such a legal transplant has occurred in Italy: on 31 December 1996, for instance, the EU data protection directive 46 from 1995 was implemented by the Parliament of Rome with a law that contributed to creating a privacy culture in Italy.
(Remarkably, Italians still refer to this right using the English word “privacy.”) Yet, as occurs with any other transplant, legal transplants also entail a risk of rejection. In the case of the Italian 1988 code of criminal procedure, a number of the new provisions on the role of the parties and their powers, on the notion of procedural truth, etc., contrasted with some principles of the Italian constitution and the legal culture of this country. Our network analysis casts light on this rejection in a twofold way. First, we can observe it in light of Table 5 above: ruling 313 from 1990, with which the Italian Consulta declared invalid some provisions of the code on plea bargaining, ranks # 11 amongst the hubs of the network. Decision 89 from 1996 is # 25. However, if we restrict the spectrum of the analysis, and consider only the set of rulings from 1989 to mid-2014 in accordance with Table 6 above, we obtain a more fine-grained picture of the rejection: ruling 313/1990 becomes # 6 of the ranking, whereas judgement 89/1996 is # 8. In addition, we find ruling 432 from 1995 as # 20, 131 from 1996 as # 27, etc. As a result, hubs show not only the fields of the law that can be under pressure but also how the latter evolve and vary as time goes by (Pagallo 2010).

4. Conclusions
This paper succinctly presented the ways in which we constructed the citation network of the in via incidentale rulings of the Italian Constitutional Court, distinguishing the results in accordance with three levels of abstraction on legal information “for” reality, “as” reality, and “about” reality. We presented similarities and divergences with previous work on network analysis and jurisprudence. On the one hand, one of the most striking results of ten years of research on legal networks is their “family resemblance,” e.g. the power law distribution of information that spontaneously flows in such networks as the case law of the U.S. Supreme Court, of the EU Court of Justice, and of the Italian Constitutional Court. On the other hand, we insisted on the differences among such networks as a matter of knowledge and concepts that frame the representation and function of a given system, that is, legal information “about” reality.

Further steps of this research include, among other things, a more extensive examination of how the case law of the Italian Constitutional Court has evolved throughout the decades, along with the comparison of this network and its hubs with other criteria, such as the most searched cases in legal databases, the number of academic papers and essays for every decision of the Court, down to evidence provided by legal scholars on the basis of their knowledge of the subject-matter. The overall aim is both to increase our legal understanding of true semantic content as a necessary condition for knowledge and to grasp the multiple ways in which the three levels of abstraction on legal information and reality interact. Going back to the stance on legal information “about” reality, for instance, it seems fair to affirm that the latter can really help legislators when they set the rules, or instructions, for the determination of other informational objects in the system, i.e. legal information “for” reality.
As shown by the example on legal transplants mentioned above in Section 3.2, the price of ignoring semantic information with factual content may be a phenomenon of legal rejection.

References

Bacci, L., Francesconi, E. and Sagri, M. (2013) A Proposal for Introducing the ECLI Standard in the Italian Judicial Documentary System, in Proceedings of JURIX 2013, edited by K. Ashley, 49-58. IOS Press, Amsterdam.
Boulet, R., Mazzega, P. and Bourcier, D. (2010) Network Analysis of the French Environment Code, in AI Approaches to the Complexity of Legal Systems. Complex Systems, the Semantic Web, Ontologies, Argumentation, and Dialogue, edited by P. Casanovas et al., 39-53. Springer, Dordrecht.
Floridi, L. (2009) A Very Short Introduction to Information. Oxford: Oxford University Press.
Fowler, J. H., and Jeon, S. (2008) The Authority of Supreme Court Precedent, Social Networks, 30: 16-30.
Kleinberg, J. M. (1998) Authoritative Sources in a Hyperlinked Environment, in Proceedings of ACM-SIAM Symposium on Discrete Algorithms, pp. 668-677.
Malmgren, S. (2011) Towards a Theory of Jurisprudential Relevance Ranking: Using Link Analysis on EU Case Law. Master of Laws degree at Stockholm University under the supervision of C. Magnusson Sjöberg.
Pagallo, U. (2006) Teoria giuridica della complessità. Torino: Giappichelli.
Pagallo, U. (2010) As Law Goes By: Topology, Ontology, Evolution, in AI Approaches to the Complexity of Legal Systems. Complex Systems, the Semantic Web, Ontologies, Argumentation, and Dialogue, edited by P. Casanovas et al., 12-26. Springer, Dordrecht.
Simon, H. A. (1996) The Sciences of the Artificial. MIT Press, Cambridge Mass. and London.
Watson, A. (1974) Legal Transplants: An Approach to Comparative Law, Edinburgh and London: Scottish Academic Press.

Prominent Actors in Italian Civil Judiciary: a Social Network Analysis study

Gabriele Rinaldi, Courthouse of Messina, Messina, Italy
Giacomo Fiumara∗, Department of Mathematics and Computer Science, University of Messina, Italy

Abstract

One of the main problems of the Italian Civil Judiciary is the long duration of civil trials. In this paper we propose a semantic network model that represents all the prominent actors of the Civil Judiciary together with their interactions. The model is applied to a dataset of civil trials extracted from the databases of the Courthouse of Messina, and the techniques of Social Network Analysis are used to study it. A conclusion of our (still preliminary) study is that long durations can be ascribed to the strongly unequal distribution of workload among actors.

∗ Corresponding author

1 Introduction

In this paper we show the results of our work on the modelling and analysis of the network of relations among the actors of civil lawsuits in Italy. Our aim is to identify strong and weak points of the system, together with some features of the main actors. To this end we extracted data related to a series of civil lawsuits completed or in progress during the first six months of 2009. For privacy reasons, all the actors have been anonymized. We chose this dataset because it contains all civil lawsuits recorded when the computerization of the activities of the Courthouse of Messina started. The central entity in the Italian Civil Judiciary is the case file, which is assigned to a court (or a single judge) and relates to a controversy between two or more parties. Each of the parties is defended by one or more lawyers. Judges may appoint advisors for technical expertise in fields such as Computer Science, Finance, Ballistics, Genetics and so on. Unlike social networks, in which relations usually connect entities of the same kind, namely people, in this work we built a semantic network that connects the aforementioned actors (see Figure 1). Our model makes some assumptions: there are no direct interactions among judges, nor among technical advisors. We must also assume that no interaction exists among lawyers, even if they defend the same party. This may appear a severe limitation, but since our interactions are extracted from the databases of the Courthouse of Messina, we have no evidence of such interactions.

2 Social Network Analysis

Social Network Analysis relies on the concept of centrality measure to draw attention to those nodes (actors) which play important roles inside the network [5]. In the following we briefly recall the centrality measures used in the analysis of our model network.

2.1 Degree centrality

The degree centrality of a node is defined as the number of edges adjacent to this node. It is a simple yet significant way to measure the importance of nodes in a network. For an undirected graph G = (V, E) with n nodes, we can define the degree centrality measure as

$$C_D(v) = \frac{d(v)}{n-1}, \tag{1}$$

where d(v) is the number of edges adjacent to the node v. Since a node can be adjacent to at most n − 1 other nodes, n − 1 is the normalization factor introduced to make the definition independent of the size of the network and to have 0 ≤ C_D(v) ≤ 1. A node with a high degree can be seen as a hub, an active node and an important communication channel. We are interested in the degree distribution of judges. In fact, as a consequence of the structure of our model, judges are directly connected only to case files and therefore the degree distribution is an immediate measure of workload distribution. The degree distribution of the actors involved in Civil Judiciary is shown in Figure 3. The first conclusion that can be drawn is that the workload is very unequally distributed among actors.

Figure 1: The semantic model of Italian civil lawsuit (a) and its representation as a network (b)

Figure 2: The network under examination (the colors of nodes have the same meaning as in Figure 1b)

Figure 3: The degree centrality distribution

According to our model, judges are assigned case files and therefore have relationships only with case files. Hence, the degree of a judge coincides with the number of case files assigned to that judge. A limited number of judges (red line) carry a large workload, while many more have only a small number of case files. The same holds for advisors (green line): the overwhelming majority are involved in very few trials, while a small number have to manage a large number of trials. This has to do with the recognized technical competencies of advisors, but probably there is also little communication among judges when advisors are appointed. The consequence is that the same small number of technical advisors has to manage the majority of trials. Lawyers, too, are chosen according to parameters such as confidence and distinction.
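As an illustration of Equation (1), the following minimal Python sketch computes normalized degree centrality and reads off the workload of each judge. The node names and the toy edge list are hypothetical and are not taken from the Messina dataset.

```python
from collections import Counter


def degree_centrality(nodes, edges):
    """C_D(v) = d(v) / (n - 1) for an undirected graph."""
    n = len(nodes)
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {v: degree[v] / (n - 1) for v in nodes}


# Hypothetical toy network: judges are linked only to their case files.
nodes = ["judge_A", "judge_B", "file_1", "file_2", "file_3", "file_4"]
edges = [
    ("judge_A", "file_1"),
    ("judge_A", "file_2"),
    ("judge_A", "file_3"),
    ("judge_B", "file_4"),
]

centrality = degree_centrality(nodes, edges)

# Undo the normalization to recover the number of case files per judge.
workload = {
    v: round(c * (len(nodes) - 1))
    for v, c in centrality.items()
    if v.startswith("judge")
}
```

In this toy example judge_A holds three of the four case files, which is the kind of imbalance the degree distribution is meant to expose.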

2.2 Closeness centrality

Closeness centrality is defined as the inverse of the sum of the lengths of the shortest paths between a node and all other nodes in the graph [7]. It measures how central a node is with respect to information spreading and how far a node is from all other nodes of the graph. Closeness centrality can be expressed as

$$C_c(i) = \left[ \sum_{j=1}^{N} d(i,j) \right]^{-1}$$

where d(i, j) is the length of the shortest path between nodes i and j, and N is the number of nodes.

Figure 4a shows the closeness centrality distribution for the various actors of the network. The actors with the highest values of closeness centrality are lawyers. Indeed, they are involved in more trials and therefore are “attached” to more case files than any other type of actor. Parties come second in terms of being “near” to other actors. In our opinion, this reflects how contentious some parties are, as they are involved in more trials.
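The closeness formula above can be evaluated with one breadth-first search per node. The sketch below is only an illustration on a hypothetical five-node chain (party, case file, lawyer, case file, party) and assumes a connected, unweighted graph.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness_centrality(adj):
    """C_c(i) = 1 / sum_j d(i, j), assuming the graph is connected."""
    return {v: 1 / sum(bfs_distances(adj, v).values()) for v in adj}


# Hypothetical toy network: one lawyer attached to two case files.
adj = {
    "party_1": ["file_1"],
    "file_1": ["party_1", "lawyer"],
    "lawyer": ["file_1", "file_2"],
    "file_2": ["lawyer", "party_2"],
    "party_2": ["file_2"],
}

closeness = closeness_centrality(adj)
most_central = max(closeness, key=closeness.get)
```

Here the lawyer sits in the middle of the chain and therefore has the smallest total distance to all other nodes.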

2.3 Betweenness centrality

The betweenness centrality [4] of a node is the number of shortest paths between every pair of nodes of the graph that traverse the given node. The idea is that the more central the node is, the higher the number of shortest paths traversing it. The betweenness centrality can be expressed as:

$$C_B(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$$

where σ_st is the number of shortest paths between s and t, and σ_st(v) is the number of those paths that pass through v.

Figure 4: Closeness centrality (a) and Betweenness centrality (b) distributions

The betweenness centrality distribution for the actors of our sample of Civil Judiciary is shown in Figure 4b. Not surprisingly, the highest values are exhibited by judges, who often represent the shortest way to traverse the network. In contrast, parties and lawyers are peripheral with respect to our model and therefore exhibit relatively low values of betweenness centrality.
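The betweenness sum can be computed directly from shortest-path counts. The following sketch is a simple, unoptimized illustration that counts each unordered pair (s, t) once; the toy network is hypothetical, and for large graphs a faster method such as Brandes' algorithm would normally be preferred.

```python
from collections import deque
from itertools import combinations


def bfs_counts(adj, source):
    """Distances and numbers of shortest paths from source to reachable nodes."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness_centrality(adj):
    """C_B(v) = sum over pairs s, t of sigma_st(v) / sigma_st."""
    info = {v: bfs_counts(adj, v) for v in adj}
    score = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # s and t are not connected
        for v in adj:
            if v in (s, t):
                continue
            dist_v, sigma_v = info[v]
            # v lies on a shortest s-t path iff d(s, v) + d(v, t) = d(s, t)
            if v in dist_s and t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return score


# Hypothetical toy network: one judge with two case files, one lawyer each.
adj = {
    "lawyer_1": ["file_1"],
    "file_1": ["lawyer_1", "judge"],
    "judge": ["file_1", "file_2"],
    "file_2": ["judge", "lawyer_2"],
    "lawyer_2": ["file_2"],
}

betweenness = betweenness_centrality(adj)
```

In this toy chain the judge lies on the largest number of shortest paths, while the lawyers at the ends lie on none.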

2.4 Density

For an undirected graph G = (V, E) with n nodes and m edges, we can define the density as

$$D = \frac{2m}{n(n-1)}. \tag{2}$$

| Graph Metric | Value |
|---|---|
| Graph type | Undirected |
| Vertices | 1958 |
| Edges | 2722 |
| Connected Components | 1 |
| Diameter | 18 |
| Average Geodesic Distance | 6.89 |
| Density | 0.0014 |
| Average Degree | 1.39 |
| Average Weighted Degree | 3.73 |
| Modularity | 0.798 |
| Communities | 21 |

Due to the structure of this network, a very low value of density is obtained (0.0014).
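As a quick check of Equation (2), the reported density can be recomputed from the number of vertices and edges listed among the graph metrics. The function name is our own choice for this sketch.

```python
def density(n, m):
    """Density of an undirected graph with n nodes and m edges: 2m / (n(n - 1))."""
    return 2 * m / (n * (n - 1))


# Vertices and edges as reported for the network under examination.
d = density(1958, 2722)  # approximately 0.0014
```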

2.5 Geodesic Distance

The distance between two nodes in a network is the number of edges in a shortest path that connects them. Directly connected to the concept of distance is the diameter of a network, which is the maximum distance between any pair of nodes.
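Both the diameter and the average geodesic distance follow from all-pairs shortest-path lengths. The sketch below assumes a connected, unweighted graph and uses a hypothetical four-node path as input.

```python
from collections import deque


def distances_from(adj, source):
    """Breadth-first search: number of edges on a shortest path from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter_and_average_distance(adj):
    """Maximum and mean geodesic distance over all pairs of distinct nodes."""
    total, pairs, diameter = 0, 0, 0
    for s in adj:
        for t, d in distances_from(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return diameter, total / pairs


# Hypothetical toy graph: a path a - b - c - d.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
diameter, average = diameter_and_average_distance(adj)
```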

3 Communities

A network exhibits a community structure if its nodes can be grouped into collections of nodes such that the number of links existing in each collection is significantly higher than the number of links among collections. In the case of non-overlapping communities, the network naturally splits into collections of nodes. The problem of finding communities in a network can be solved by adopting the concept of network modularity which can be expressed as follows: given a network, represented by means of a graph G = (V, E), which has been partitioned into m communities, its corresponding value of network modularity is

$$Q = \sum_{s=1}^{m} \left[ \frac{l_s}{|E|} - \left( \frac{d_s}{2|E|} \right)^{2} \right] \tag{3}$$

where l_s is the number of edges between vertices belonging to the s-th community and d_s is the sum of the degrees of the vertices in the s-th community. High values of Q imply high values of l_s for each discovered community, yielding communities that are internally densely connected and weakly coupled to each other. For a comprehensive survey of the community detection problem, the reader is referred to [3], while a detailed discussion of some algorithms can be found in [1, 2].

The high value of modularity induced us to explore the communities. We adopted the greedy algorithm of Newman [6]. This algorithm works as follows: the graph is initially considered as a collection of isolated nodes. An edge is first added which increases the modularity of the whole network with respect to the previous configuration. All other edges are added according to the same principle: the addition of an edge which decreases modularity is rejected, while it may happen that a new edge does not modify the overall modularity (this happens when the new edge is internal to one of the clusters). The algorithm stops when the addition of new edges produces no further increase in modularity. The outcome of this algorithm was 21 communities. The size distribution of the various communities is shown in Figure 5, while the largest communities are shown in Figures 6 and 7. An interesting feature is that all the largest communities, with one exception, are centered around one judge. The exception is the community shown in Figure 7a, in which two judges are present. This is due to the same advisors being appointed by the two judges and to some lawyers who happen to defend parties before both judges.
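To make the modularity of Equation (3) and the greedy strategy concrete, here is a naive Python sketch. It starts from isolated nodes and repeatedly joins the pair of communities whose union raises Q the most, which is how greedy modularity optimisation is commonly formulated. It is a toy illustration on a hypothetical graph of two triangles joined by one edge, not the implementation used in the study, and it is far slower than Newman's published algorithm.

```python
from itertools import combinations


def modularity(edges, communities):
    """Q = sum_s [ l_s / |E| - (d_s / (2|E|))^2 ] for a partition of the nodes."""
    m = len(edges)
    label = {v: i for i, com in enumerate(communities) for v in com}
    internal = [0] * len(communities)  # l_s: edges inside community s
    degree = [0] * len(communities)    # d_s: summed degrees in community s
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(l / m - (d / (2 * m)) ** 2 for l, d in zip(internal, degree))


def greedy_communities(nodes, edges):
    """Start from singletons; apply the merge that increases Q most; stop when none does."""
    communities = [frozenset([v]) for v in nodes]
    current = modularity(edges, communities)
    while len(communities) > 1:
        best_gain, best_partition = 1e-12, None
        for a, b in combinations(communities, 2):
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            gain = modularity(edges, merged) - current
            if gain > best_gain:
                best_gain, best_partition = gain, merged
        if best_partition is None:  # no merge increases modularity
            break
        communities = best_partition
        current += best_gain
    return communities, current


# Hypothetical toy graph: two triangles (0, 1, 2) and (3, 4, 5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
communities, q = greedy_communities(range(6), edges)
```

On this toy graph the procedure recovers the two triangles as communities.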

4 Conclusions

In this paper we presented a model network that represents the interactions among the actors of the Italian Civil Judiciary [8]. We extracted data on civil lawsuits filed over a period of six months at the Courthouse of Messina and represented them as a semantic network whose nodes embody the various actors: judges, technical advisors, lawyers, parties and case files. The network we obtained was studied using the techniques of Social Network Analysis. The results show that one of the long-standing problems of the Italian Civil Judiciary, namely the long duration of civil lawsuits, may be ascribed to the unequal distribution of workload among actors. There may be several reasons for this. Some technical advisors may be more competent than others, while some judges may have gained considerable experience with certain types of lawsuits, so that all lawsuits of that kind are assigned to them. Moreover, some lawyers may have more charisma than others and are therefore chosen more frequently. Nonetheless, the unequal distribution of workload severely hampers a timely conclusion of civil lawsuits. Future improvements to our study may include a bigger dataset (the one we used for this study contains case files filed in the first six months of 2009), a detailed analysis of the various types of trials and an analysis of the evolution of the network over time.

Figure 5: The size distribution of communities

Figure 6 (panels a and b)

Figure 7 (panels a and b)

References

[1] P. De Meo, E. Ferrara, G. Fiumara, and A. Provetti. Generalized Louvain method for community detection in large networks. In Proc. 11th International Conference on Intelligent Systems Design and Applications, pages 88–93. IEEE, 2011.
[2] P. De Meo, E. Ferrara, G. Fiumara, and A. Provetti. Enhancing community detection using a network weighting strategy. Information Sciences, 222:648–668, 2013.
[3] S. Fortunato. Community detection in graphs. Physics Reports, 486(3-5):75–174, 2010.
[4] L. Freeman. A set of measures of centrality based on betweenness. Sociometry, pages 35–41, 1977.
[5] C. Morselli. Assessing vulnerable and strategic positions in a criminal network. Journal of Contemporary Criminal Justice, 26(4):382–392, 2010.
[6] M. Newman. Fast algorithm for detecting community structure in networks. Phys. Rev. E, 69(6):066133, 2004.
[7] M. Newman. A measure of betweenness centrality based on random walks. Social Networks, 27(1):39–54, 2005.
[8] G. Rinaldi. I principali attori del processo civile e le interazioni tra di essi: modellizzazione mediante un grafo semantico e sua analisi. Master Degree in Informatics, University of Messina, 2010.

Reference Structures of National Constitutions

Bart KARSTENS a, Marijn KOOLEN a, Giuseppe DARI-MATTIACCI b, Rens BOD a and Tom GINSBURG c
a Institute for Logic, Language and Computation, University of Amsterdam, The Netherlands
b Amsterdam Centre for Law and Economics, University of Amsterdam, The Netherlands
c University of Chicago Law School, United States of America

Abstract. Interpretability of legal texts is an important condition for the effective usage of the law. This holds even more strongly for constitutions because, next to legal experts, common citizens need to be able to understand the contents of the constitution in order to enforce its regulations. Previous research [11] has shown that variance in interpretability of constitutions is mainly a function of purely textual characteristics. One of these characteristics, which has not been subjected to systematic research, is the internal reference structure of a text. In this paper we present a first analysis of reference structures of national constitutions. We set up reference structures for all current constitutions and analyze them through (1) network analysis and (2) correlation with a number of other factors, such as length, number of topics covered and colonization history. We claim that an increase in the sheer number of references does not in itself result in a decrease in interpretability. Whether this is the case depends on the degree of organization of referential complexity. To back this claim we present a comparison between countries with low and high degrees of organization of their reference structures. Finally, our analysis yields a number of factors that help explain differences in degree of referential complexity.

Keywords. constitutions, reference structure, complexity, interpretability, network analysis

1. Introduction

In [11] it is argued that interpretability is an important virtue for all legal texts, but even more so for constitutions because ordinary citizens have to be able to read the constitution well to properly enforce its regulations. Moreover, constitutions ought to be unifying and time-independent documents. They can only ensure national unity, inter-generational commitment and self-enforcement when the majority of people are able to understand them. Constitutional interpretability varies considerably. The authors of [11] tested a number of hypotheses to find out why this is so. Interestingly, they found that interpretability is primarily a function of textual characteristics, such as composition and structure of the text, and not so much of other things one would expect, such as distance in time to drafting or distance in culture.

Still, even when attention is narrowed to textual characteristics there is no unequivocal method to measure complexity of the law. This means that we have to rely on assessments of particular virtues. This is further complicated by the fact that interpretability may come into conflict with other virtues. An assessment of legal complexity must balance between such virtues as essentiality of norms, accuracy (legal security), applicability and genericity (hence simplicity), see [2] p.335. When drafting a constitution a trade-off takes place between these virtues, governed only by the demand of legal coherence. In [11] it is pointed out that clarity and internal consistency may not always be desirable over vagueness. To ensure agreement on a general principle like, for example, freedom of speech, we do not specify exactly what speech is. Further, vagueness can also help to avoid having permanent constitutional losers, as with vagueness there still is room to bargain.1 On the other hand, when exactness of regulation is required, this may also run against interpretability. Balancing between virtues depends in part on the interests of the parties involved in drafting constitutions. Constitutions are conventions, in which ultimate effectiveness is determined by the players themselves rather than external actors. Demands of scope, that is the number of topics addressed, and detail, pull in different directions. Long constitutions can capture both, but if constitutions are short we must expect a trade-off. In this paper we propose to measure complexity of national constitutions through reference structure analysis. We consider reference structures as distributed networks. This mode of representation allows us to perform network analysis.

The potential usefulness of this approach has also been noted by others: “The sciences that study complex systems, whether natural or artificial, provide concepts and tools that may be used to promote the emergence of new approaches to law and legal systems.” ([2] p.335). In [11] the reference structure of a text is explicitly mentioned as an important measure of readability: “Consider this passage from the Kenyan Constitution of 1963 (Art. 181.1), which refers the reader to six different sections in order to qualify the powers of the court of appeal: Subject to the provisions of sections 50(5), 61(7), 101(5) and 210 (5) of this Constitution and of subsection (4) of this section, an appeal shall lie as of right to the Judicial Committee from any decision given by the Court of Appeal for Kenya or the Supreme Court in any of the cases to which this subsection applies or from any decision given in any such case by the Court of Appeal for Eastern Africa or any other court in the exercise of any jurisdiction conferred under section 176 of this Constitution. These types of lexical gymnastics are not rare in our experience. A logical predictor of interpretability, then, is the linguistic complexity of the text’s syntax. Our hypothesis is that more linguistically complex texts should be harder to interpret.” This hypothesis has been tested in general in [11], but not specifically in relation to reference structures. It is our aim to gain more systematic insights into this aspect of textual complexity. We are especially interested in the relation between reference patterns, scope of topics and length of texts, because scope and detail are given as the most important factors influencing readability. Next to this we are also interested in identifying common causes for cases in which numerous references occur. We must consider whether these causes are irreducible aspects of national constitutions or whether they allow for reduction of complexity.2

1 [6] offers a view on constitutions as living organisms. Flexibility, that is the ability to adapt to changing circumstances, is given as a strong indicator of constitutional longevity.

2. Generating reference structures

We have written a parser in Perl to create the reference structures of all current constitutions. The texts of these were available in English, in HTML format, through the constitutions project website.3 We have traced internal references only, and excluded external references to other texts.4 For each constitution we have retrieved the number of references, the level on which each reference occurs (article, chapter, paragraph, etc.) and whether it is a backward or forward reference. We experienced problems with multiple references like ‘see article 1-5’, which should amount to 5 references instead of 1. This was solved by adding an ‘unpacking’ script. We also had problems with the proper recognition of numerals and ordinals. This was solved by first translating them into numbers (in the right context) and then parsing the text. The most challenging problems came from detecting anaphora such as ‘preceding’, ‘previous’, ‘this’. We have decided for the moment to ignore these, as they mostly occur with respect to inner-article referencing.5

The parser is still a bit coarse and, if possible, needs to be refined. When the use of numbers is not well ordered, results become messy. This is for example the case with potentially interesting countries with a high number of references such as Sweden, South Africa and Nigeria. We would also like to have a measure of vicinity of the target article to the source article because we suspect that a high vicinity ratio (i.e. source and target are close) indicates that there is less need for ‘lexical gymnastics’. Our measure of weakly connected components excludes the direction of references, but a measure of strongly connected components, hence including direction of reference, presents other difficulties (see the remark on the measure of path-length below). Notwithstanding these difficulties, we believe our research has already produced a number of interesting findings.
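The unpacking of multiple references and the backward/forward classification described above can be sketched as follows. This is a minimal Python illustration, not the authors' Perl parser: the regular expression, the function name and the sample sentence are assumptions made for this example, and the pattern deliberately ignores ordinals, anaphora and sub-numbering such as 50(5).

```python
import re

# Hypothetical pattern: "article 12", "articles 1-5", "section 50", ...
REF_PATTERN = re.compile(
    r"\b(article|section)s?\s+(\d+)(?:\s*-\s*(\d+))?", re.IGNORECASE
)


def extract_references(source_article, text):
    """Return (source, level, target, direction) tuples, unpacking ranges."""
    references = []
    for match in REF_PATTERN.finditer(text):
        level = match.group(1).lower()
        start = int(match.group(2))
        end = int(match.group(3)) if match.group(3) else start
        # A range such as "1-5" counts as five separate references.
        for target in range(start, end + 1):
            if target == source_article:
                direction = "self"
            elif target < source_article:
                direction = "backward"
            else:
                direction = "forward"
            references.append((source_article, level, target, direction))
    return references


# Hypothetical sentence taken to occur in article 3 of some constitution.
refs = extract_references(
    3, "Subject to articles 1-5 and section 12 of this Constitution"
)
```

With this input the range is expanded into five references, followed by one forward reference to section 12.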
A large share of countries have few or no references. Yet there is also a considerable group of countries with 100 or more references. Most of these are former British colonies, with the group of West-Indies constitutions strikingly at the top end. Of the 39 countries with 100 or more references, 35 are former British colonies (if we include Greece). The other four, Sweden, Thailand, Belgium and Germany, have constitutions which are not related to a decolonization process (see Table 1). These countries are democracies and democracies tend to have longer constitutions than other political regimes (see [8]). Length (in terms of number of words) and number of references have a correlation of about 0.58, which means that not all of the increase in the number of references can be explained by an increase in length.

2 In [2] it is claimed that articles of a constitution cannot lean too much on other articles because that would make these articles less constitutional. The French Constitutional Council has lambasted the occurrence of such legal complexity in constitutions, and advised reducing it whenever possible.
3 See http://comparativeconstitutionsproject.org
4 Interdependence of legal texts has been the main focus of research on reference resolution, see for example [5] and [7]. As constitutions ought to enforce themselves, intertextuality is less relevant in the context of constitutions.
5 [5] provides a good overview and a useful discussion of these parsing problems.

On the other side of the scale we find all former French colonies, which have very few references (Gabon scores highest with 17, see Table 2).6 There is an issue of drafting style involved here, reflecting the civil law vs. common law distinction. Civil law is characterized by a policy implementing style and exhibits a preference for brevity, while common law exhibits a dispute resolving style [3]. While thinking in terms of a strong opposition between legal families, stemming from [14], has been seriously questioned in recent years (see [12]), the reference structures of national constitutions show a clear distinction between British and French former colonies, which must be seen as a reflection of the common and civil law styles of legal thought. Many former British colonies, for example, have had independence or constitutional conferences in London. British jurists were involved in drafting the new constitutions and hence the texts reflect their mode of legal thought.

3. Hubs and authorities

As said, the length of texts can only partly account for the number of references. Hence other reasons must be sought for the countries having a large number of references. We used concepts from Social Network Analysis to investigate the properties of the reference structures. For each article in each constitution we calculated the hub score (number of outgoing references) and authority score (number of incoming references). In Table 3 we see the total number of references (column 2), the number of articles with incoming references (authorities, column 3) and with outgoing references (hubs, column 7), and the maximum, mean and standard deviation of authority scores (columns 4–6) and hub scores (columns 8–10) per country. Further, the number of Weakly Connected Components (WCC, components of articles connected to each other through references) of the constitutions could be established. These are shown in Table 4 for the top 40 countries, ranked in terms of number of references in descending order.

In large components, central hubs and authorities play a role as meta-clauses, which can have a variety of purposes. Belgium is a clear example of a reference structure with central authorities. These involve regulations concerning the country’s multilingual character (see left side of Figure 1). We checked whether other multilingual countries showed comparable patterns, but could not find these. We think that the reason for this is that officially recognized languages often do not coincide with the distribution of executive power. But even in countries in which governmental organization is strongly related to linguistic diversity, such as for example Canada or Cyprus, no authorities similar to those of Belgium are present.

Thailand is a country with many references, but these involve a significant degree of organization in terms of WCCs. Thailand has one of the lowest WCC ratios. This indicates that the number of minute clusters is relatively small and hence can be taken as a measure of a better (more recognizable) organization structure (see right side of Figure 1). There are also clear hubs at the end of the Thailand constitution involving specifications of articles which are not to be enforced in a period of transition of government. In other countries we find similar hubs (Botswana, Lesotho). Poland has two clear

6 Length of texts and topic ratios are taken from the constitutionsproject website. For linguistic variation we have only included officially recognized languages.
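The hub score, authority score and weakly connected components used in this section can be sketched in a few lines of Python. The article numbers below are hypothetical, and the code is only an illustration of the definitions given in the text (outgoing references, incoming references, and components that ignore direction), not the analysis pipeline of the paper.

```python
from collections import Counter


def hub_and_authority_scores(references):
    """Hub score = outgoing references; authority score = incoming references."""
    hubs = Counter(source for source, _ in references)
    authorities = Counter(target for _, target in references)
    return hubs, authorities


def weakly_connected_components(references):
    """Group articles linked by references, ignoring direction (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for source, target in references:
        parent[find(source)] = find(target)

    components = {}
    for article in parent:
        components.setdefault(find(article), set()).add(article)
    return list(components.values())


# Hypothetical (source article, target article) pairs.
references = [(10, 4), (11, 4), (12, 4), (20, 21), (21, 22)]

hubs, authorities = hub_and_authority_scores(references)
components = weakly_connected_components(references)
```

In this toy input, article 4 acts as a central authority with three incoming references, and the references fall into two weakly connected components.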

Table 1. Top end of the list of constitutions with references, length, topics, colonization and linguistic diversity

| Country | # Refs | Length (# words) | Length-Ref ratio | Topic ratio | Colonizer |
|---|---|---|---|---|---|
| Papua New Guinea | 403 | 58,490 | 145.14 | 0.47 | GB |
| Tuvalu | 403 | 34,801 | 86.35 | 0.4 | GB |
| Sweden | 344 | 13,635 | 39.64 | 0.61 | |
| Thailand | 329 | 44,756 | 136.04 | 0.73 | |
| St Kitts and Nevis | 327 | 49,643 | 151.81 | 0.56 | GB |
| Jamaica | 293 | 42,727 | 145.82 | 0.4 | GB |
| Malaysia | 280 | 64,080 | 228.85 | 0.59 | GB |
| India | 279 | 146,385 | 524.68 | 0.6 | GB |
| Lesotho | 267 | 45,532 | 170.53 | 0.56 | GB |
| South Africa | 254 | 43,062 | 169.53 | 0.64 | GB |
| Trinidad and Tobago | 251 | 36,302 | 144.63 | 0.51 | GB |
| St Lucia | 248 | 38,271 | 154.31 | 0.61 | GB |
| Belize | 244 | 39,629 | 162.41 | 0.47 | GB |
| Dominica | 231 | 36,080 | 156.19 | 0.54 | GB |
| Sierra Leone | 231 | 44,636 | 193.22 | 0.54 | GB |
| St Vincent and the Gr. | 226 | 34,817 | 154.05 | 0.49 | GB |
| Swaziland | 224 | 48,604 | 216.98 | 0.66 | GB |
| Grenada | 219 | 33,737 | 154.05 | 0.47 | GB |
| Antigua and Barbuda | 217 | 38,464 | 177.25 | 0.56 | GB |
| Barbados | 204 | 34,144 | 167.37 | 0.41 | GB |
| Singapore | 194 | 40,076 | 206.58 | 0.5 | GB |
| Nigeria | 191 | 66,263 | 346.93 | 0.63 | GB |
| Zimbabwe | 191 | 39,976 | 209.29 | 0.61 | GB |
| Mauritius | 178 | 36,333 | 204.11 | 0.47 | GB |
| Sri Lanka | 169 | 40,085 | 237.19 | 0.67 | GB |
| Fiji | 167 | 40,000 | 239.52 | nodata | GB |
| Guyana | 164 | 46,221 | 281.83 | 0.59 | GB |
| Belgium | 155 | 16,119 | 103.99 | 0.59 | |
| Kenya | 152 | 48,818 | 321.17 | 0.81 | GB |
| Botswana | 143 | 30,713 | 214.77 | 0.41 | GB |
| Cyprus | 138 | 36,199 | 262.31 | 0.53 | GB |
| Gambia | 135 | 43,465 | 321.96 | 0.73 | GB |
| German Fed. Rep. | 126 | 27,236 | 216.15 | 0.71 | |
| Solomon Islands | 119 | 31,836 | 267.53 | 0.63 | GB |
| Greece | 110 | 27,177 | 247.06 | 0.73 | GB |
| Malawi | 104 | 33,422 | 321.36 | 0.67 | GB |
| Pakistan | 104 | 56,240 | 540.77 | 0.64 | GB |
| Seychelles | 104 | 40,740 | 391.73 | 0.63 | GB |
| Bahamas | 100 | 41,835 | 418.35 | 0.47 | GB |
| Malta | 92 | 31,820 | 345.87 | 0.47 | GB |

# Off. lang. (values in the order they appear in the source text; their alignment to rows is not recoverable): 3 2; 2 23 2 11; 2; 4 3 2 3 3 3 2 2 2; 2 2 3 2

Table 2. Constitutions of former French colonies

| Country | # Refs | Length (# words) | Length-Ref ratio | Topic ratio | Colonizer |
|---|---|---|---|---|---|
| Gabon | 17 | 11804 | 694.35 | 0.6 | FR |
| Tunisia | 14 | nodata | nodata | nodata | FR |
| Madagascar | 13 | 15759 | 1212.23 | 0.6 | FR |
| Morocco | 12 | 15897 | 1324.75 | 0.64 | FR |
| Senegal | 12 | 10866 | 905.50 | 0.53 | FR |
| Algeria | 11 | 10038 | 912.55 | 0.61 | FR |
| Burkina Faso | 9 | 9000 | 1000.00 | 0.51 | FR |
| Haiti | 9 | 17423 | 1935.89 | 0.63 | FR |
| Guinea | 8 | 12707 | 1588.38 | 0.64 | FR |
| Mauritania | 8 | 6997 | 874.63 | 0.49 | FR |
| Benin | 7 | 11386 | 1626.57 | 0.56 | FR |
| Cambodia | 7 | 8936 | 1276.57 | 0.61 | FR |
| Lebanon | 6 | 6296 | 1049.33 | 0.46 | FR |
| Mali | 6 | 7503 | 1250.50 | 0.54 | FR |
| Niger | 6 | 14806 | 2467.67 | 0.66 | FR |
| Central African Republic | 5 | 10197 | 2039.40 | 0.64 | FR |
| Chad | 5 | 11768 | 2353.60 | 0.57 | FR |
| Monaco | 5 | 3814 | 762.80 | 0.37 | FR |
| Socialist Republic of Vietnam | 5 | 11344 | 2268.80 | 0.51 | FR |
| Syria | 5 | 8154 | 1630.80 | 0.63 | FR |
| Congo | 4 | 9970 | 2492.50 | 0.56 | FR |
| Djibouti | 4 | 6666 | 1666.50 | 0.41 | FR |
| Cote d'Ivoire | 2 | 7897 | 3948.5 | 0.5 | FR |

# Off. lang. (values in the order they appear in the source text; their alignment to rows is not recoverable): 2 3 3; 2 2; 2

Figure 1. Belgium (left) and Thailand (right)

Table 3. The number of references and the maximum and mean number of references incoming (authority) and outgoing (hub) of the top 40 countries.

                                   Authority                     Hub
Country                  # refs    Total  Max.  Mean  st.dev.    Total  Max.  Mean  st.dev.
Papua New Guinea         403       257    17    1.56  1.43       293    16    1.37  1.12
Tuvalu                   403       231    13    1.74  1.58       26     7     1.52  1.1
Thailand                 329       134    12    2.45  2.08       165    16    1.99  1.75
St Kitts and Nevis       327       159    11    2.05  1.73       198    13    1.65  1.25
Jamaica                  293       183    6     1.6   1.02       16     11    1.78  1.47
Malaysia                 280       126    16    2.22  2.01       188    7     1.48  1.03
India                    279       154    17    1.81  1.89       19     8     1.43  1
Lesotho                  267       160    23    1.66  2.04       17     9     1.51  1.03
South Africa             254       153    10    1.66  1.31       163    27    1.55  2.26
Trinidad and Tobago      251       145    10    1.73  1.44       191    4     1.31  0.62
St Lucia                 248       124    11    2     1.81       110    12    2.25  1.92
Belize                   244       141    12    1.73  1.51       16     6     1.44  0.85
Dominica                 231       161    8     1.43  0.97       156    6     1.48  0.9
Sierra Leone             231       155    5     1.49  0.83       157    11    1.47  1.05
St Vincent and the Gr.   226       15     9     1.45  1.07       146    9     1.54  1.12
Swaziland                224       168    6     1.33  0.76       178    11    1.25  1.09
Grenada                  219       136    8     1.61  1.18       14     6     1.54  0.91
Antigua and Barbuda      217       145    7     1.49  1.01       143    11    1.51  1.15
Barbados                 204       88     10    2.31  1.96       75     14    2.72  2.81
Singapore                194       77     9     2.51  1.84       150    6     1.29  0.69
Nigeria                  191       68     38    2.8   5.34       80     34    2.38  4.73
Zimbabwe                 191       124    8     1.54  1.18       164    4     1.16  0.55
Mauritius                178       111    5     1.6   0.97       136    4     1.3   0.56
Sri Lanka                169       74     9     2.28  1.75       119    6     1.42  0.88
Fiji                     167       36     27    4.63  6.36       42     18    3.97  4.36
Guyana                   164       78     22    2.1   2.98       12     6     1.35  0.88
Belgium                  155       43     49    3.6   8.3        76     7     1.65  1.24
Solomon Islands          119       95     6     1.25  0.69       96     3     1.23  0.53
Greece                   110       57     8     1.92  1.42       79     5     1.39  0.78
Malawi                   104       75     3     1.38  0.6        85     4     1.22  0.58
Pakistan                 104       69     4     1.5   0.         74     7     1.4   0.97
Seychelles               104       51     6     2.03  1.44       54     9     1.92  1.61
Bahamas                  100       53     9     1.88  1.54       91     3     1.09  0.33

hubs which regulate what to do in case of an emergency such as a natural disaster. In other countries these also involve regulations in case of war or armed conflict. Sometimes central nodes (both hubs and authorities) are related to the political structure of a country. Bhutan, for example, expresses a clear hierarchical structure starting with the King's authoritative power. Other clear examples are Mauritius, Uganda and Zimbabwe.7 In other cases central nodes specify civil rights. But these too often have the character of meta-clausules. That is, these nodes are central because they specify when other articles (temporarily) lose their force, or in which situations they can be repealed or amended. Thus regulations with overriding force in special situations are the most frequently occurring authorities and hubs and, in some cases, take up the bulk of the references altogether.

7 This can be relevant to interpretability because the presence of 'multiple executives' apparently decreases readability [11].

Table 4. The number of weakly connected components and the WCC ratio

Country                  Number of WCCs   WCC ratio
Papua New Guinea         128              0.44
Tuvalu                   84               0.32
St Kitts and Nevis       71               0.36
India                    87               0.45
Trinidad and Tobago      80               0.42
Malaysia                 57               0.3
Swaziland                108              0.61
Lesotho                  83               0.47
Belize                   68               0.4
Thailand                 20               0.12
Jamaica                  68               0.41
Zimbabwe                 88               0.54
South Africa             77               0.47
Sierra Leone             74               0.47
Dominica                 71               0.46
Singapore                40               0.27
St Vincent and the Gr.   64               0.44
Antigua and Barbuda      69               0.48
Grenada                  58               0.41
Mauritius                60               0.44
Guyana                   44               0.36
Sri Lanka                40               0.34
Kenya                    72               0.61
Botswana                 66               0.57
St Lucia                 30               0.27
Cyprus                   42               0.39
Gambia                   53               0.53
Belgium                  11               0.11
Solomon Islands          62               0.65
Bahamas                  45               0.49
Malawi                   54               0.64
Nigeria                  30               0.38
Greece                   23               0.29
Ghana                    52               0.67
German Fed. Rep          32               0.42
Barbados                 20               0.27
Kiribati                 39               0.53
Pakistan                 47               0.64
Uganda                   41               0.63
Tanzania                 34               0.54

Finally, we found that path length, which is otherwise a good measure of complexity, could not help us very much in the analysis of the reference structures of national constitutions. While there are many paths of three or more nodes, the problem is that from a single article there is often more than one possible path to follow. Measuring the total number of paths is therefore not a useful measure for our purposes. Moreover, we very rarely find a string of targets which are sources in their own right, which would force the reader to follow a significant number of references in order to come to a proper understanding of the original source article. Because of these problems we decided to ignore the path length variable, but it is certainly something we have to look into in future research.
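The hub and authority columns of Table 3 can be illustrated with a small sketch. It assumes that a reference structure is available as a list of (source article, target article) pairs, that "Total" counts the articles with at least one reference in the given direction, and that the standard deviation is the population one; none of this is stated in the text, although the first assumption is consistent with the table (for Papua New Guinea, 257 × 1.56 ≈ 401, close to the 403 references reported). The function name and the toy data are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def degree_summary(edges):
    """Summarise incoming (authority) and outgoing (hub) reference counts.

    edges: list of (source_article, target_article) pairs, at least one.
    Only articles with at least one reference in the given direction are
    counted; this is an assumption about how Table 3 was built.
    """
    indegree = Counter(target for _, target in edges)
    outdegree = Counter(source for source, _ in edges)

    def stats(counter):
        values = list(counter.values())
        return {
            "total": len(values),
            "max": max(values),
            "mean": round(mean(values), 2),
            "st.dev.": round(pstdev(values), 2),  # population st.dev. (assumed)
        }

    return {
        "refs": len(edges),
        "authority": stats(indegree),
        "hub": stats(outdegree),
    }


# Hypothetical toy constitution: article 3 is an authority, article 15 a small hub.
toy_edges = [("12", "3"), ("14", "3"), ("15", "3"), ("15", "7"), ("20", "7")]
summary = degree_summary(toy_edges)
```

In the toy data two articles receive references (three and two each), so the authority mean is 2.5, while four articles send references, giving a hub mean of 1.25.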

4. Reference structures and interpretability of constitutions

In the final section of this paper we check how our analysis relates to findings on differences in readability of national constitutions. We defend the following thesis: an increase in the number of references cannot in itself be taken as a sign of a decrease in interpretability. What matters is the recognizability of patterns in the reference structure. Such patterns significantly reduce the demand on the mental gymnastics of readers. The patterns are given by (1) the presence of hubs and authorities (scale-free networks) and (2) component analysis. The presence of many weakly connected components indicates less organization. In these cases references are isolated, and when this happens frequently we suspect that the text is difficult for readers to process. An exact threshold or tipping point at which a text goes from unreadable to readable cannot be given. Hence, because of the gradual nature of this change, transitions in structural organization are best considered as phase transitions. This can be related to the theory of phase transitions in [15]. In the case of reference structures in constitutions, transitions towards more organization must be valued positively in terms of readability.8 We support our thesis by considering the countries that are reported in [11] to score high (Haiti, Thailand, Pakistan) and low (France, India, Mexico, Guyana) on readability, and add why Belgium probably scores high and Kenya low.9 Thailand has one of the largest numbers of references, but both the low WCC ratio (0.12) and the presence of hubs presumably maintain readability. The same holds for Belgium with a WCC ratio of 0.11, the difference being that Belgium's well-organized referential structure is given by authorities (it has one of the highest indegree means) instead of hubs.

We find a much more even distribution of references in countries such as India and Kenya (note that we analyzed a later Kenyan constitution than the one referred to in the quote above). See Figure 2. The references in these networks are so evenly distributed that they closely resemble random graphs. Their loose connectedness makes them score low in terms of textual organization. Kenya, for example, has one of the highest WCC ratios. India and Mexico, on the other hand, do not fit this picture, as their WCC ratios are much lower. However, these constitutions are among the longest and they also contain a large number of topics. Hence they score high in terms of scope and detail, and this negatively influences interpretability.

8 But this may not hold for other aspects, such as the endurance of the constitution.
9 The parser finds no references for Haiti, so we have nothing to say about this constitution.
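The component analysis can be sketched as follows. The exact definition of the WCC ratio is not given in this section, so the sketch assumes it is the number of weakly connected components divided by the number of articles that take part in at least one reference; the optional `denominator` parameter is a hypothetical hook for a different normalisation (for instance the number of source articles). Function and variable names are illustrative only.

```python
def wcc_ratio(edges, denominator=None):
    """Return (number of weakly connected components, WCC ratio).

    edges: iterable of (source_article, target_article) pairs.
    The ratio defaults to components / nodes, which is an assumption
    and not a definition quoted from the paper.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for source, target in edges:
        root_s, root_t = find(source), find(target)
        if root_s != root_t:
            parent[root_s] = root_t  # edge direction is ignored (weak connectivity)

    if not parent:
        return 0, 0.0
    components = {find(node) for node in list(parent)}
    base = denominator if denominator else len(parent)
    return len(components), round(len(components) / base, 2)


# One hub referring to four articles: a single component.
star = [("1", "2"), ("1", "3"), ("1", "4"), ("1", "5")]
# Three isolated references: three components.
pairs = [("1", "2"), ("3", "4"), ("5", "6")]

star_result = wcc_ratio(star)
pairs_result = wcc_ratio(pairs)
```

Under this reading, a structure organised around a hub yields a low ratio (0.2 for the star), whereas isolated references yield a high one (0.5 for the pairs), which matches the direction of the argument above.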

Figure 2. India (left) and Kenya (right)

It is hard to say which structural element has the clearest effect on readability. It is, however, natural to assume that a combination of negative scores on reference-length ratio, WCC ratio, number of references, number of hubs and authorities, etc., suggests with confidence that a text is hard to interpret. Two countries, however, score the opposite of the expected values across the board. France comes out of the test in [11] as a poorly readable constitution. Yet it has a low WCC ratio and fairly high indegree and outdegree values. Moreover, the constitution is short and covers only half of the possible topics. Perhaps France simply does not have enough references (a total of 52) to justify firm conclusions on interpretability. On the other hand, the length-reference ratio is below 200, which indicates an above-average frequency of references. For Pakistan the anomalous result can certainly not be explained away on the grounds of having too few references. Still, Pakistan has a high WCC ratio of 0.63 and modest outdegree and indegree means of 1.4 and 1.5. It is also one of the longest constitutions and covers 0.64 of the possible topics. On all these scores the constitution of Pakistan should lead to problems of interpretability, but the opposite appears to be the case. This is a mystery, which we must leave unexplained.

5. Conclusion

We have presented an analysis of the texts of national constitutions in terms of distributed networks of internal references. We find a group of countries with a huge number of references but also a group of countries having few or no references at all. This distribution appears to reflect the civil law vs. common law distinction, as almost all former British colonies have elaborate reference structures while former French colonies do not. References occur for a variety of reasons, but when central nodes (hubs and authorities) are present this is often in the form of a meta-clausule. Such nodes are articles which specify circumstances in which other articles are not effective, or under which circumstances changes to the constitution can come about. Such referential structure cannot easily be reduced, but fortunately the virtue of interpretability is promoted, rather than undermined, by the presence of central nodes. Referential complexity is not in itself a sign that legal texts are unintelligible. Whether this is the case depends on the clarity of organization of the references. Countries with a low WCC ratio and with a significant number of hubs and/or authorities continue to score well on interpretability even if they have a large number of references.

Our analysis can support most, but not all, earlier findings on the readability of constitutions. We suggest that more research is required on the factors influencing the readability of texts to make the picture complete. After all, [11] was based only on a number of tests with a group of law students, which is not enough to count as compelling evidence. We would also like to be able to say more about a possible threshold from which the number of references really becomes a relevant factor. Finally, it would be interesting to add a diachronic perspective to our referential complexity research and investigate by comparison if, how and why the reference structures of subsequent constitutions of individual countries have changed. As always, the attempt to solve one problem leads to more open questions than one started out with.

References

[1] Boulet, R., Mazzega, P., Bourcier, D., 'The Network of French Legal Codes', in: ICAIL09: Proceedings of the 12th International Conference on Artificial Intelligence and Law (New York, 2009)
[2] Boulet, R., Mazzega, P., Bourcier, D., 'A Network Approach to the French System of Legal Codes - Part I: Analysis of a Dense Network', in: Artificial Intelligence and Law 19 (2011) pp. 333-355
[3] Damaska, Mirjan R., The Faces of Justice and State Authority: A Comparative Approach to the Legal Process (New Haven and London, 1986)
[4] Dari-Mattiacci, Giuseppe and Guerriero, Carmine, 'Law and Culture: A Theory of Comparative Variation in Bona Fide Purchase Rules', in: M. Faure and J. Smits (eds.), Does Law Matter? On Law and Economic Growth, Ius Commune Series 100 (Cambridge etc. 2011) pp. 137-154
[5] De Maat, Emile, Winkels, Radboud and Van Engers, Tom, 'Automated Detection of Reference Structures in Law', in: Legal Knowledge and Information Systems. Jurix 2008: The 21st Annual Conference (IOS Press)
[6] Elkins, Zachary, Ginsburg, Tom and Melton, James, The Endurance of National Constitutions (Cambridge, 2009)
[7] Fowler, James H., Johnson, Timothy R., Spriggs, James F., Jeon, Sangick and Wahlbeck, P.J., 'Network Analysis and the Law: Measuring the Legal Importance of Precedents at the US Supreme Court', in: Political Analysis 15-3 (2007) pp. 324-346
[8] Ginsburg, Tom, 'Constitutional Specificity, Unwritten Understandings and Constitutional Agreement', Public Law and Legal Theory Working Papers 330 (2010)
[9] Ginsburg, Tom, Elkins, Zachary and Blount, Justin, 'Does the Process of Constitution-Making Matter?', in: Annual Review of Law and Social Science 5 (2009) pp. 201-23
[10] Jackson, M.O., Social and Economic Networks (Princeton 2008)
[11] Melton, James, Elkins, Zachary, Ginsburg, Tom and Leetaru, Kalev, 'On the Interpretability of the Law: Lessons from the Decoding of National Constitutions', in: British Journal of Political Science (December 2012) pp. 1-25
[12] Pargendler, Mariana, 'The Rise and Decline of Legal Families', in: American Journal of Comparative Law 60 (2012) pp. 1043-1074
[13] Wasserman, Stanley and Faust, Katherine, Social Network Analysis: Methods and Applications (1994)
[14] Zweigert, K. and Kötz, H., Introduction to Comparative Law, 3rd edition (Oxford 1998)
[15] Janson, S., Luczak, T., Rucinski, A., Random Graphs (Cambridge 2000)

35 years of Multilateral Environmental Agreements Ratification: a Network Analysis

Romain BOULET a,1, Ana Flavia BARROS-PLATIAU b,c and Pierre MAZZEGA c,d

a Magellan Research Center, IAE University of Lyon, France
b Institute of International Relations, Campus D. Ribeiro, University of Brasília, Brazil; CNPq PQ and CAPES researcher
c International Joint Laboratory Observatory of Environmental Changes (LMI-OMA), Campus D. Ribeiro, University of Brasília, Brazil
d UMR5563 Geosciences Environment Toulouse, University of Toulouse, France

Abstract. With the ratification of Multilateral Environmental Agreements (MEAs), the countries of the international community or of intentional communities, be they political, economic, financial, security-related or strategic, endow these instruments of international cooperation with significant autonomy. From the 3550 dates of ratification of these MEAs recorded from 1979 to mid-September 2014, we produce a graph whose vertices are the 48 MEAs (ratified at least once) and whose links are induced by the succession of ratifications in time. On this basis we propose a diagnosis of the international acceptance of this type of legal instrument and of its vulnerability in a global context shaped by the change in the balance of powers as a result of globalization, the break-up of the bipolar and then unipolar system, and the rise of new powers. It appears that a global environmental order was promoted and implemented with some success in the 1990s, mainly by liberal Western countries who were then able to lead other countries less inclined to bind themselves to the fulfillment of environmental obligations. However, the expansion of this global environmental order now seems frozen, due to the current crisis of multilateralism. The rise of many countries, particularly in the South, whose environmental, political and economic weight has grown, confronted with the "stable community" formed in the past 35 years, suggests that there is a real power shift in the international arena and that, consequently, multilateralism needs to reflect this new reality. In other words, the global environmental order is slowly being reformed. As a consequence, the treaties formed clusters in the past, but they have not followed the same pattern since the beginning of the 21st century.

Keywords. Multilateral Environmental Agreements, graph theory, emerging countries, country intentional community, ratification dynamics, global environmental order.

Introduction

The decision of a Parliament to ratify a Multilateral Environmental Agreement (hereafter MEA) depends on many factors, including national and international inter-linkages. At the national level, elite strategies, executive–legislative relations, and political pressure from interest groups and public opinion are key to understanding the process of ratification [1, 2]. At the international level, there is a large array of factors to be explored, such as a State's desire to participate in a global order of liberal inspiration, constraints derived from its domestic politics, strategies developed to establish regional or international leadership, and the maintenance of an international reputation [3]. Last but not least, the pressure exerted by other countries through negotiations within formal and informal institutions has to be considered. Certainly, this list is not exhaustive, and the relative importance of these criteria depends on each State and evolves with the international political context [4, 5]. In this sense, some MEAs were ratified faster than others, whereas some of them were ratified only by an insignificant number of States. Most of the studies produced, from which we have drawn the conclusions above, focus on States as "agents" observed through their behavior or even their strategies on the international and national political scenes. These same agents impart to multilateral agreements a dynamic of their own: a life, we are tempted to say, after J.S. Lantis [1, 6]. If ratified treaties do have a life, then what can be said about their lifetime? The lives of these MEAs have at least five stages at which complex negotiations take place, starting with the agenda setting that leads to the creation of an international legal instrument, followed by protracted negotiations to establish an acceptable text. Only then is the MEA opened for signature, with unpredictable ratification success, and finally comes its potential entry into force. All these stages can be followed on the basis of empirical facts open to analysis and interpretation, a largely unexplored scientific field.

1 Corresponding Author: [email protected]
This is the path we begin to follow here, offering an analysis of the history of ratification of MEAs over the period 1979-2014, using an approach based on graph theory. The main question is how to group the MEAs according to their ratification dates, and what evidence the results provide. From a methodological point of view, this study helps to diversify the use of graphs and illustrates their relevance for work on Law and International Relations. The aim of this paper is to assess whether there is a cluster of ratified MEAs, that is, whether there is a network, and how it has evolved since 1979. In doing so, we aim to corroborate the qualitative and comparative analyses of social science researchers, according to which environmental multilateralism is in deep crisis. A graph is a generic representation which has recently found many applications, especially in the study of various social networks and their dynamics [7, 8, 9], and more recently in the network analysis of citations between legal texts [10, 11, 12, 13]. A thorough study of this type has recently been produced specifically on Multilateral Environmental Agreements by R.E. Kim [14]. Unlike that study, which links the texts via their cross-citations, we link MEAs here via their ratifications by countries. In a way, the present study is the dual (we give a more precise meaning to this notion below) of the study of the same data that we recently submitted [15], which focused on State actors and whose main conclusions will be recalled here.

Section 1 presents the data we use and, to facilitate the reading of this work, fixes our simplified, non-technical use of key terms such as "agreement", "Convention" (with capital C), "State" and "ratification". Various macro-indicators associated with the statistical distribution of ratifications are also presented. Section 2 presents the graph of the MEAs and the identification of the clusters of agreements that emerge from the analysis of this graph. The analysis and interpretation of these clusters are developed in Section 3, with a critical discussion. The conclusion of this work is drawn in Section 4.

1. MEAs Ratification Data and Statistics

1.1 Pace of ratification over the period 1979-2014

The data used here are available on the website of the United Nations Treaty Collection, in Chapter XXVII, dedicated to the environment.2 We use two types of information: a) the date of opening for signature of any "agreement", which serves as a reference date to assess the time until an eventual ratification; b) the date of ratification by each individual country (or entity of equivalent status, such as the European Union). The use we make here of some terms must be specified. Under the term "MEA" we group conventions, protocols, agreements, amendments, treaties, etc. listed on the UN site. When necessary, we make a distinction between the 17 major conventions and protocols, under each of which other texts may be gathered to form what we call here a "Convention" (with capital C). For example, "convention 7" (the numbers are those used on the UN website), i.e. the United Nations Framework Convention on Climate Change (UNFCCC for short), covers a Convention that includes, in addition to this framework convention, the Kyoto Protocol (7a), the Amendment to Annex B to the Kyoto Protocol (7b) and the Doha Amendment to the Kyoto Protocol (7c).

Figure 1. Time series (diamonds) of the yearly ratification rate (y-axis, 0 to 300) of MEAs over the years 1975 to 2015 (x-axis). The squares (arbitrarily placed at the top of the figure) indicate the date of the opening for signature of the 17 international conventions on environment.

By "State" we mean any political entity entitled to sign or ratify an MEA. By "ratification" we mean actual ratification but also acceptance, approval or accession (though we exclude succession). Note also that we do not consider any event specific to the life of States, such as State creation or secession: the list of countries and the dates of ratification were established on September 15, 2014 from the site of the UN Treaty Collection mentioned above. In total, we identified 48 MEAs ratified by at least one State, 197 "States" that have ratified at least one MEA, and 3550 ratifications in the period 1979-2014. Changes in the annual rate of ratifications are presented in Figure 1. With regard to the gradual establishment of an international environmental order (though partial, see below) based on the MEAs, 1992 is notable because it marks the opening for signature of several major conventions, including the UNFCCC and the Convention on Biological Diversity (CBD for short). The 1990s saw a real surge of part of the international community in favor of engagement on various environmental issues through MEAs. Then, after a significant drop from 1995, the ratification rate rises above the threshold of 200 ratifications per year from 2002 to 2004 and then decreases rapidly again. In sum, these peaks coincide with two major United Nations (UN) environment and development summits: the 1992 United Nations Conference on Environment and Development (UNCED or Earth Summit) held in Rio, and the 2002 World Summit on Sustainable Development in Johannesburg (WSSD or Rio+10 Summit).

2 https://treaties.un.org/pages/Treaties.aspx?id=27&subid=A&lang=en

1.2 Who ratified which MEAs?

The international community is organized around a variety of issues, which has led to the establishment of political (e.g. African Union, European Union), economic (e.g. ASEAN, G7, MERCOSUR), financial (e.g. G20) or strategic (e.g. NATO) formal institutions3 or, more recently, of institutions promoting the emergence of a new global order (e.g. BRICS) in a context of contested US hegemony, weakened Western liberal institutions and the increasing participation of emerging countries [16]. The reasons that led to the creation of these institutions (which we qualify as intentional in this study) do not include environmental protection, climate change, or better management of natural resources, although these motivations are modulated in various ways depending on the international context. Therefore, in order to make a diagnosis of the lives of MEAs, it is interesting to see what positions these international institutions occupy as a result of the commitments made (or declined) by their member States.
In this research, these institutions did not contribute to the network analyses, since their member States did not necessarily ratify the MEAs with the same time lag, except for the EU and its members, which show exactly the opposite. First, we produce some statistical macro-indicators drawn from the empirical analysis of ratification dates (Table 1). Of course, the intentional communities listed do not form a partition4 of the set of 197 countries that have ratified at least one MEA. These communities partly overlap, some countries belonging to several of them. Furthermore, the large numbers of ratifications lead to the strengthening of the regime complex in the sense employed by Orsini et al. [17]. We found three broad MEA clusters emerging: a) those having MEAs ratified by a large majority of countries and as such having a significant global normative impact (Conventions 10, 2, 14, 15); b) those with mixed membership at the global scale, about half of the EU countries but also many Southern countries having ratified (Conventions 8, 7, 3, 12); c) a group of Conventions that especially the Southern and emerging countries have not ratified yet. For some agreements the time since their opening for signature is not sufficient to establish a definitive assessment of their relative success or failure (see in particular the Minamata Convention on Mercury, code 17, opened for signature in 2013). Indeed, the time of ratification is more than 6.5 years on average (with a standard deviation of 4.5 years) for Southern countries. This comprehensive diagnosis, however, tends to stabilize given the strong decline in the annual rate of ratification in recent years.

3 Websites describing these institutions and their member States can easily be found on the web. Like the EU, MERCOSUR and ASEAN are adopting negotiating agendas that are more political. Although the BASIC Group (Brazil, South Africa, India and China) could be an interesting counter-example, it was not considered because it is still just a group whose institutionalization in the near future is open to dispute.
4 Here the word "partition" is used in a set-theoretic sense: a partition of a set S is a set P of subsets p of S such that the p are pairwise disjoint, the union of all p forms the whole set S, and none of the elements of P is empty.

Table 1. Average time lag of ratification (in years) and percentage (number in parenthesis) of country members of various institutions having ratified the MEAs composing the 17 "international environmental Conventions" (see text). The "All" column stands for the international community of the 197 world countries having ratified at least one MEA. AU is the African Union; MERC. stands for MERCOSUL. The Conventions numbered from 1 to 17 (1st column, C) are ranked by decreasing % of countries having ratified the MEAs (2nd column; number in parenthesis). The light gray cells indicate an average time of response larger than the median of the reaction times of all countries having ratified. The dark gray cells indicate that countries of the considered community have not ratified any of the MEAs of the Convention.

C    All        ASEAN       AU          BRICS      EU          G7          G20         MERC.
10   3.6 (99)   4.2 (100)   2.6 (98)    3.8 (100)  4.3 (100)   2.9 (100)   3.3 (100)   2.6 (100)
02   6.9 (98)   7.4 (100)   7.9 (98)    5.4 (100)  3.8 (95)    2.3 (100)   3.9 (100)   4.9 (100)
15   4.2 (91)   4.2 (80)    4.2 (94)    4.5 (100)  3.8 (93)    1.7 (71)    4.3 (90)    3.7 (100)
14   6.8 (78)   9.03 (80)   6.9 (83)    7.2 (100)  4.9 (96)    4.5 (86)    6.0 (90)    6.6 (100)
08   3.8 (56)   4.5 (60)    4.0 (59)    2.7 (55)   3.0 (65)    2.6 (43)    3.0 (52)    3.1 (50)
07   4.3 (55)   5.0 (52)    5.1 (51)    3.6 (75)   3.4 (52)    2.8 (50)    3.6 (60)    2.8 (53)
03   8.2 (45)   8.8 (40)    10.5 (45)   4.4 (40)   5.6 (63)    5.6 (48)    5.9 (50)    6.9 (63)
06   9.2 (21)   -           -           1.9 (20)   9.1 (93)    9.7 (57)    7.8 (30)    -
12   9.9 (18)   17.0 (10)   11.7 (21)   1.4 (20)   10.4 (50)   13.9 (57)   11.4 (25)   -
13   5.2 (18)   -           -           -          5.1 (92)    5.1 (48)    4.8 (22)    -
05   6.5 (17)   -           -           3.8 (20)   5.5 (75)    6.2 (38)    5.7 (22)    -
04   7.0 (15)   -           -           -          6.6 (75)    6.1 (32)    5.9 (16)    -
01   6.0 (14)   -           -           0.8 (7)    5.4 (64)    3.2 (60)    3.0 (27)    -
09   4.4 (4)    -           -           -          4.4 (30)    4.4 (36)    4.4 (12)    -
11   3.1 (4)    -           3.3 (12)    -          -           -           -           -
16   1.1 (3)
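The two quantities reported in each cell of Table 1 can be sketched for a single agreement as follows. This is a minimal illustration under stated assumptions: the lag is measured from the date of opening for signature to the date of ratification and expressed in years of 365.25 days, the standard deviation is the population one, and the share is taken over the members of one community. The function name, the state labels and all dates are hypothetical; the aggregation over the several MEAs of a Convention is not reproduced.

```python
from datetime import date
from statistics import mean, pstdev


def ratification_lag(opened, ratified, members):
    """Average lag (years), its st.dev., and % of `members` that ratified.

    opened:   date of opening for signature of the agreement
    ratified: dict mapping a state to its ratification date
    members:  non-empty list of states of the community considered
    """
    lags = [
        (ratified[state] - opened).days / 365.25  # years, an assumed convention
        for state in members
        if state in ratified
    ]
    share = round(100 * len(lags) / len(members))
    if not lags:
        return None, None, share
    return round(mean(lags), 1), round(pstdev(lags), 1), share


# Hypothetical agreement and community (invented dates, not UN data).
opened = date(2000, 1, 1)
ratified = {"A": date(2002, 1, 1), "B": date(2004, 1, 1)}
community = ["A", "B", "C", "D"]

lag_mean, lag_sd, percent = ratification_lag(opened, ratified, community)
```

Here two of the four hypothetical members ratified, after about two and four years, so the sketch reports an average lag of 3.0 years for a 50% share, in the same "lag (percentage)" form as the table cells.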

       V   S   A   P   D   K   J   N   G   O   sum=
V      1   1   2   1   2   1   1   1   2   1   13
S      1   2   2   2   2   1   2   1   1   2   16
A      1   1   1   1   1   1   1   1   1   1   10
P      1   1   1   1   1   1   1   1   1   1   10
D      1   1   2   1   1   1   1   2   1   1   12
K      1   1   1   1   1   1   1   1   1   1   10
J      1   1   1   2   1   1   1   1   1   1   11
N      1   1   1   1   1   1   2   2   1   1   12
G      1   1   1   1   1   1   1   1   1   1   10
O      1   1   1   1   1   1   1   1   1   2   11
sum=   10  11  13  12  12  10  12  12  11  12

The differences between individual cells in the tables under comparison can be estimated using the method from [4, ex. 4.4.30, p. 201]. The differences of cells are presented in Table 16 and Table 17. There is a 2 in the third column of the first row in Table 16. This means that the difference between the respective cells in the Presidential Election Law filtered table and its Citizenship Act prediction table is greater than the similarity threshold of 3. This is in accordance with the analysis of Table 11. There is a 2 at the beginning of the fourth row in Table 17. This agrees well with the analysis of Table 12.
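The coding used in Tables 16 and 17 can be sketched as follows. The per-cell difference measure itself comes from [4] and is not reproduced here; the sketch simply assumes that a matrix of non-negative difference scores is already available and applies the coding from the table caption (1 if the difference is at most 3, 2 if it is greater), together with the row and column sums. The function name and the small 3×3 input are hypothetical.

```python
THRESHOLD = 3  # similarity threshold named in the text


def code_differences(differences, threshold=THRESHOLD):
    """Code each cell as 1 (difference <= threshold) or 2 (difference > threshold).

    differences: rectangular list of lists of per-cell difference scores.
    Returns the coded table with its row sums and column sums.
    """
    coded = [[1 if value <= threshold else 2 for value in row] for row in differences]
    row_sums = [sum(row) for row in coded]
    column_sums = [sum(column) for column in zip(*coded)]
    return coded, row_sums, column_sums


# Hypothetical difference scores for three word types (not data from the paper).
scores = [
    [0.5, 3.0, 4.2],
    [1.1, 0.2, 2.9],
    [7.5, 3.1, 0.0],
]
coded, row_sums, column_sums = code_differences(scores)
```

A row or column whose sum is well above its number of cells then points to a word type whose pair frequencies are poorly predicted by the other document.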

Table 17. Difference of Citizenship Act table from Presidential Election Law by cells (1:≤ 3, 2:> 3)

       V   S   A   P   D   K   J   N   G   O   sum=
V      2   2   1   2   1   2   2   2   1   1   16
S      2   1   1   1   1   1   1   1   1   2   12
A      2   1   2   1   1   2   2   2   1   1   15
P      2   2   2   2   1   1   2   1   1   1   15
D      2   2   1   1   1   1   1   1   1   1   12
K      1   1   1   1   1   1   1   2   1   1   11
J      2   2   2   1   1   1   1   1   1   1   13
N      1   2   1   1   2   1   1   1   1   1   12
G      2   2   2   1   2   1   2   2   1   2   17
O      1   2   1   1   1   1   1   1   1   2   12
sum=  17  17  14  12  12  12  14  14  10  13
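The coding used in Tables 16 and 17 can be sketched in a few lines of Python. This is only an illustration: it assumes that the per-cell differences have already been computed (the estimation method from [4] is not reproduced here) and that they are non-negative. The function name and the toy values are hypothetical.

```python
from typing import List, Tuple


def code_cells(diffs: List[List[float]], threshold: float = 3.0
               ) -> Tuple[List[List[int]], List[int], List[int]]:
    """Code each cell as 1 (difference <= threshold) or 2 (difference > threshold).

    Returns the coded table together with its row sums and column sums,
    mirroring the "sum=" margins of the tables.
    """
    codes = [[1 if d <= threshold else 2 for d in row] for row in diffs]
    row_sums = [sum(row) for row in codes]
    col_sums = [sum(col) for col in zip(*codes)]
    return codes, row_sums, col_sums


# Toy 2x3 table of cell differences (made-up values, not taken from the paper).
toy = [[0.5, 4.1, 3.0],
       [7.2, 3.5, 1.0]]
codes, row_sums, col_sums = code_cells(toy)
```

With this coding, a row or column sum equal to the number of cells (10 in Tables 16 and 17) means that no cell in that row or column exceeded the threshold.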

5. Conclusion

We studied the structure of legal acts using sequences of consecutive pairs of word types in sentences. The frequency tables of two documents were compared using prediction tables calculated from the kernel tables of the documents. We used the chi-square method to estimate the similarity of tables. Because it is asymmetric, the chi-square method also tells us whether one of the tables is better predicted by the other. Our results show that the complexity of legal acts can be estimated using the proposed method, but it is not easy to find documents of exactly the same complexity. On the other hand, we can see that the structure of the Estonian Constitution, as a higher-level legal act, predicts the structure of lower-level constitutional acts rather well.

References

[1] Frederick Mosteller. Association and Estimation in Contingency Tables. Journal of the American Statistical Association 63, No. 321 (1968), 1–28.
[2] Ermo Täks. An Automated Legal Content Capture and Visualisation Method. PhD Thesis. Tallinn University of Technology, 2013.
[3] TAHMM, Disambiguator of ESTMORF Results. (in Estonian) http://www.eki.ee/keeletehnoloogia/projektid/tahmm/
[4] Rein Jürgenson, Ülo Kaasik, Ivar Kull, Leo Võhandu. Tasks for Programming. Tallinn, Valgus, 1978. (in Estonian)

Nets of Legal Information Connecting and Displaying Heterogeneous Legal Sources

Nicola Lettieri°¹, Sebastiano Faro*, Luca Vicidomini#, Antonio Altamura#
° ISFOL (Rome) and University of Sannio, Benevento, Italy
* Institute of Legal Information Theory and Techniques, Italy
# University of Salerno, Italy

Abstract: The paper presents ongoing research on the design of an information retrieval tool that combines open dataset analysis, metacrawling and network visualization techniques to produce interactive maps of the heterogeneous legal documents connected to a specific piece of legislation (what we call the “Reference Network of the Norm”). We gather references to norms, case law, legal literature and preparatory works related to a given legislative measure and visualize them in their mutual relations (according to the “Query-Browsing” paradigm). Starting from the graph presented as a result, the full text of every relevant source can be displayed (according to the “Browsing-Query” paradigm). The goal is twofold: to explore the potential of visualization in legal information retrieval, and to pave the way to quantitative analyses of specific areas of the legal system by means of statistical and network analysis methods applied to different connected sources. The work focuses on the Italian legal order, but the approach we propose is general-purpose and therefore extensible to other legal frameworks.

Keywords: Graph visualization, meta crawling, heterogeneous legal sources, visualization for legal information retrieval.

1. Connecting Heterogeneous Legal Sources: The Lawiz Project

Legal regulation of social life is the result of the connection between different sources of law (laws and regulations, case law, legal literature, administrative practices, etc.) which make up a unitary picture.
As a consequence, an understanding of the legal order, and more particularly of specific sectors inside it, can only be attained by jointly reading all the legal sources related to the theme of interest. When one tries to understand whether and how a certain phenomenon is regulated by law, what matters is not a single source but the picture that emerges from the contextual analysis of all the relevant sources of law connected to it: exploring law is almost like connecting dots, bits and pieces, and trying to compose them into a unitary picture that makes sense. In this vein, a tool that makes it possible to retrieve and collect in a single context all the important sources on a certain topic is a precious ally for the research activity of both the legal professional and the layman. The help is greater still if the contents are presented in ways that make it easier to perceive the connections between the sources and the evolution of the sources over time. Of particular interest in this respect are the forms of visualization made possible by network analysis tools. In addition to affording a particularly intuitive form of communication, visualization in the form of a graph offers an ease of interaction that is very useful when one has to handle large quantities of documents.

This paper presents Lawiz, an ongoing research project inspired by the growing diffusion, in the legal field, of graph visualization and analysis tools such as Ravel Law (a platform for legal research using search, analytics, and visualization)² and similar applications like those cited by Stevenson & Wagoner, 2013. The project aims to explore the potential of web applications for searching, visualizing and analysing what we define here as the Reference Network of the Norm (RNN), i.e., the set of all the legal sources connected, through various types of relations, with a specific piece of legislation. These sources are (often freely and easily) accessible online but separated from one another. What is missing is a representation of the connections among the sources and the possibility of navigating through them. Our goal is to create an online tool that, starting from the search for a statutory law, produces an increasingly sophisticated interactive graphic representation of the RNN.

In this phase, Lawiz focuses on the Italian legal order, taking statutory law as the starting point for the construction of the network. The focus is particularly on state laws (in Italy regional laws also exist), a category of legal source that presents two advantages: it is more easily accessible, and it is published online with technical solutions and metadata that make it easy to identify other relevant sources. It should be noted that, although it is related to the Italian context, the approach can be generalized and, in the future, extended to other legal orders.

¹ Corresponding author.
To construct the RNN, information of various kinds is required on the various types of correlated sources. No freely accessible database being available that contains all the data of interest and wanting to construct one, we opted for a hybrid procedure combining metacrawling on online databases and analysis of open datasets. The basic philosophy of the project is therefore characterized by some fundamental choices: 

● we use data freely available online;
● each time, we obtain the necessary information by retrieving it directly, “as found and where found”;
● we create added value by reconstructing the relationships that link a specific law to its RNN and by representing graphically what is found.

In the following sections, we dwell first of all on the definition of RNN and we list the legal sources and the categories of relationships between them so far considered (Section 2). Subsequently we describe the document sources and the procedure developed to retrieve the information of interest and to construct, visualize and analyze the graph of the RNN (Section 3). There follows a brief analysis of the software architecture and implementation (Section 4). Lastly, we present the first results obtained, and we discuss the development prospects of the research (Section 5).

² https://www.ravellaw.com/

2. A definition of the “Reference Network of the Norm”

The concept of “RNN” includes several categories of legal sources connected to one another: a very large set of documents belonging to different document typologies, produced at different levels of government and also by different legal orders, as happens for the 28 EU countries that belong to a supranational organization. The reconstruction of the RNN in all its components is therefore a complex activity. On the one hand, identification of all the sources connected to a given piece of legislation is conceptually challenging and requires broad legal knowledge and in-depth research. On the other hand, the retrieval of all the documents that form the RNN raises serious operational difficulties, because not all the documents are available online and those that are available are distributed across different databases using different standards. For this reason, and in order to begin experimenting with the project idea by creating a first prototype of the tool, we adopted an “operational” version of the RNN characterized by some limitations in relation to the “complete” RNN. Specifically:

● we construct the RNN starting from a law considered as a single entity (hereinafter the “Root” of the network). We are well aware that in many cases the elementary unit to be considered for the reconstruction of the RNN is not the law as a whole but rather an internal partition such as the article. This choice depends on the granularity of the available data structure and metadata.
● we limit ourselves to the most important categories of documents (i.e., those most relevant in the hierarchy of sources of law) and the most easily attainable ones: legislation, Constitutional Court case law, Court of Cassation case law, preparatory works, and legal literature.

Such documents give rise to 6 different typologies of relations with the Root (see figure 1, in which the Root is marked as “Law”). Inside this scheme the Root:

● makes reference to or amends other state laws or legislative decrees³
● is cited or amended by other state laws or legislative decrees
● is applied or interpreted by the Court of Cassation
● is declared constitutional/unconstitutional by the Constitutional Court
● is commented on by the legal literature
● is the result of preparatory works

³ In the Italian legal system, a legislative decree is a decree issued by the government (formally, by the President of the Republic) upon delegation by the Parliament through a law. It has the same value and force as a law issued by the Parliament.

Figure 1: Reference Network of the Norm (Lawiz working version)

The RNN model just described is concretely realized and analyzed using the document sources and the procedure that we describe in detail in the following section.

3. From Data Gathering to Visualization and Analysis

The operational procedure of Lawiz develops in 4 phases, as follows:

1) Query
The starting point of the procedure is a query that uses as keywords the reference details of the normative act whose RNN is to be reconstructed (type, i.e., law or legislative decree; year; number). The query activates the search inside different databases and datasets.

2) Data gathering
The databases and the datasets on which the process of metacrawling is activated are the following.

Normattiva database. Normattiva⁴ is the public database collecting the integral texts of the laws of the state and the acts having the force of law. From querying the database and processing the metadata available in it, the following information is obtained:
● the full text in force of the Root
● reference details and full texts of the norms that modify the Root
● links to the preparatory works of the Root
● reference details and full texts of the norms cited by the Root
● reference details and full texts of the norms that cite the Root⁵
For retrieval of data from Normattiva, Lawiz extensively exploits the URN of Normeinrete (Francesconi et al., 2010), a technical standard conceived for the identification and retrieval of normative documents in electronic format online. The standard makes it possible to automatically construct, beginning from the reference details of a normative act, the link to the page that contains its full text.

Constitutional Court decisions open dataset. The Constitutional Court makes available on its official site⁶, in open format and structured in XML, the texts of its decisions (judgements and orders) together with a rich series of metadata. By seeking the reference details of the Root inside a specific XML element of the dataset, the decisions that have the Root as their object are extracted. For each of these decisions the full text and the abstract of the principle of law affirmed are retrieved (one decision may correspond to several abstracts).

Court of Cassation case law database. The Italian Court of Cassation has recently made accessible online⁷ the database of the most recent provisions of the civil section. Through a full-text search in the database, the texts of the judgements that cite the Root are obtained in PDF format.⁸

Italian Parliament databases. For state laws, the site of the Chamber of Deputies⁹ and the Senate site¹⁰ give access to cards that sum up the legislative history, including references to the various materials generated in the course of creating legislation. Links allow users to directly access the electronic texts available, such as bills, committee reports, committee hearings, parliamentary debates, and histories of actions taken.

⁴ www.normattiva.it
DoGi (Legal Doctrine) database. DoGi¹¹ is a reference database created by the Institute of Legal Information Theory and Techniques of the National Research Council, containing the references to articles published in Italian legal journals since 1970. In the DoGi database it is possible to search for normative references to the norms commented on or discussed in articles. Although the DoGi documents have a rich series of metadata, including abstracts and classification codes, at present the data used by Lawiz are limited to bibliographical references.

3) Graph visualization
By processing the information collected we construct the graph of the RNN, which presents as a network the relations that link the Root to all the connected documents identified through metacrawling. The document units are depicted as nodes, while the relations between them are represented as arcs.
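Phases 1 to 3 can be illustrated with a minimal Python sketch: the query keywords are turned into a URN-style identifier and link, and the gathered documents are attached to the Root as nodes and arcs. The URN layout, the resolver prefix, the relation labels and all identifiers below are assumptions made for illustration; they are not taken from the Lawiz code and should be checked against the Normeinrete and Normattiva documentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Assumed resolver prefix and URN layout (not confirmed by the paper).
RESOLVER = "http://www.normattiva.it/uri-res/N2Ls?"
ACT_TYPES = {"law": "legge", "legislative decree": "decreto.legislativo"}


def build_urn(act_type: str, year: int, number: int) -> str:
    """Turn the query keywords (type, year, number) into a URN string."""
    return f"urn:nir:stato:{ACT_TYPES[act_type]}:{year};{number}"


def build_link(urn: str) -> str:
    """Derive the link to the full text from the URN."""
    return RESOLVER + urn


@dataclass
class RnnGraph:
    root: str
    nodes: Dict[str, str] = field(default_factory=dict)             # id -> document kind
    arcs: List[Tuple[str, str, str]] = field(default_factory=list)  # (root, document, relation)

    def __post_init__(self) -> None:
        self.nodes[self.root] = "root"

    def connect(self, doc_id: str, kind: str, relation: str) -> None:
        """Add a document node and the arc that relates it to the Root."""
        self.nodes.setdefault(doc_id, kind)
        self.arcs.append((self.root, doc_id, relation))


# Illustrative identifiers and hypothetical relation labels.
root = build_urn("law", 1990, 241)
graph = RnnGraph(root)
graph.connect("urn:nir:stato:legge:2005;15", "norm", "amended_by")
graph.connect("dogi:0001", "legal_literature", "commented_by")
```

Keeping the relation label on each arc preserves the six relation typologies of Section 2, so a renderer could colour or filter arcs by type.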

⁵ For this purpose a full-text query is dynamically constructed using the references of the norm sought as the search key.
⁶ www.cortecostituzionale.it
⁷ www.italgiure.giustizia.it/sncass/
⁸ In the absence of more advanced information retrieval functions, at present full-text search does not give an absolute guarantee as to the degree of relevance of the judgement with respect to the topics of the law.
⁹ www.camera.it
¹⁰ www.senato.it
¹¹ http://nir.ittig.cnr.it/dogiswish/Index.htm

The development of Lawiz and its interface was inspired by two paradigms of visual information retrieval as defined by Zhang (2008, 16):

● The “query searching and browsing paradigm” (QB), which can be summed up in the following terms: “an initial regular query is required to submit to an information retrieval system to narrow things down to a limited search results set, then the search results set is visualized in a visualization environment. Finally, users may follow up with browsing to concentrate the visual space for more specific information.”
● The “browsing and query searching paradigm” (BQ), according to which “a visual presentation of a data set is first established for browsing. Then users submit their search queries to the visualization environment and corresponding search results are highlighted or presented within the visual presentation contexts.”

Figure 2. RNN display in Lawiz

According to the QB paradigm, the results of the search are translated into a graph with the node representing the Root at its centre. The other nodes are connected to it. In some cases (norms cited by, modifying and citing the Root) the node corresponds to a single document; in other cases (legal literature and case law), when several documents of the same kind were adopted in the same year, the node corresponds to a cluster identifying the set of documents found by the search.
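The clustering rule described above (one node per norm, one cluster node per kind and year for case law and legal literature) might be sketched as follows. The record layout, the kind labels and the label format are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical labels for the kinds of document that are clustered by year.
CLUSTERED_KINDS = {"case_law", "legal_literature"}


def to_display_nodes(documents: List[Dict]) -> List[Dict]:
    """Map retrieved documents to graph nodes.

    Norms keep one node each; case law and legal literature of the same
    kind and year are merged into a single cluster node.
    """
    nodes: List[Dict] = []
    groups = defaultdict(list)
    for doc in documents:
        if doc["kind"] in CLUSTERED_KINDS:
            groups[(doc["kind"], doc["year"])].append(doc["id"])
        else:
            nodes.append({"label": doc["id"], "kind": doc["kind"],
                          "members": [doc["id"]]})
    for (kind, year), members in sorted(groups.items()):
        # A group with a single document stays an ordinary node.
        label = members[0] if len(members) == 1 else f"{kind} {year} ({len(members)})"
        nodes.append({"label": label, "kind": kind, "members": members})
    return nodes


sample = [
    {"id": "norm-A", "kind": "norm", "year": 2005},
    {"id": "judgement-1", "kind": "case_law", "year": 2010},
    {"id": "judgement-2", "kind": "case_law", "year": 2010},
    {"id": "article-1", "kind": "legal_literature", "year": 2011},
]
display_nodes = to_display_nodes(sample)
```

Each cluster node keeps the list of its members, so selecting it can still lead to the individual documents.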

According to the BQ paradigm, each node of the graph constitutes a link to the texts of the documents or to other information associated with them. Specifically, in the case of nodes representing norms cited by the Root, it is possible to conduct an in-depth exploration since it is possible to identify all the norms cited by the norm taken as the starting point of the exploration each time (figure 3).

Figure 3. In-depth exploration

4) Data analysis
In the current version, Lawiz already contains a first basic attempt to use the retrieved data for quantitative analysis, in the form of a chart called the “Norm impact meter” (NIM). The NIM, structured according to the scheme sketched in figure 4, provides objective indicators of the impact of the Root on the legal system. Specifically, the purpose of the NIM is to describe in a temporal perspective:

● The evolution undergone by the Root (considering the acts that have modified it).
● The overall impact of the Root, which can be partly inferred from a count of the documents (judgements, doctrine articles, etc.) that concern it year by year. At present, the existence of a connection is identified through explicit citations alone, without entering into the semantics of the content of the documents, which could disclose implicit relations.

The data used for constructing the NIM can be downloaded in the PDF and CSV formats for further processing by the user with tools external to Lawiz.
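The year-by-year counts behind the NIM and their CSV export could look roughly like the sketch below. The record layout and the column names are assumptions; the paper does not specify the format of the downloadable file.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable


def yearly_counts(documents: Iterable[Dict]) -> Counter:
    """Count the documents that concern the Root, per year and kind."""
    return Counter((doc["year"], doc["kind"]) for doc in documents)


def counts_to_csv(counts: Counter) -> str:
    """Serialize the counts as CSV text with one row per (year, kind)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["year", "kind", "count"])
    for (year, kind), n in sorted(counts.items()):
        writer.writerow([year, kind, n])
    return buffer.getvalue()


# Made-up records standing in for the documents found by the search.
found = [
    {"year": 2010, "kind": "judgement"},
    {"year": 2010, "kind": "judgement"},
    {"year": 2011, "kind": "amendment"},
]
nim_csv = counts_to_csv(yearly_counts(found))
```

Because only explicit citations are counted, the resulting figures are a lower bound on the documents that actually concern the Root.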


Figure 4. Norm Impact Meter layout

4. Software Architecture and Implementation

Our tool can be described as a metasearch engine or federated search engine: it makes it possible to perform a query leveraging several search engines, each containing different information about the Root. More technically, Lawiz is a web application that runs on a free and open source infrastructure (Linux operating system, Apache web server, PHP language interpreter). It does not need to store any kind of data itself, but can rely on an external server that it accesses in order to retrieve data. Lawiz also adopts an asynchronous approach to data retrieval, meaning that the application page is displayed to the user as soon as possible, without waiting for the external sources to be contacted. This strategy is a well-known practice in web applications and results in improved user interaction and responsiveness.

Our software architecture can be viewed as a Model-View-Controller (MVC) design, although in our code the three components are not explicitly labelled as such. MVC is a de facto standard for application architecture, originally developed for desktop applications and now widely adopted for web applications. This paradigm divides a software application into three interconnected layers: the “View” layer is what the user can see and interact with; the “Model” represents the data that the user can manipulate; the “Controller” layer manages updates in the Model layer (e.g., data insertion, modification, deletion, etc.) and refreshes the View layer.

Our model consists of a collection of related documents. We designed our tool so that it can be easily expanded with new document sources by developing pluggable wrappers in JavaScript. Wrappers are part of the Controller layer and are called automatically by the tool. In a typical search interaction, the user starts by searching for a law. The Normattiva wrapper retrieves that law and the related ones down to a parameterized depth of search, obtaining a set of norms. Starting from this first set of results, other wrappers can expand the query and integrate the model with their own data. For instance, the DoGi wrapper can search the DoGi database for legal literature, thus enriching the model with new data. Wrappers work in an asynchronous manner, so that while they fetch data from different locations, the results obtained are progressively displayed to the user.

Figure 5. Lawiz Workflow

In the best-case scenario, the information we are interested in would be freely available and accessible via a web service offering relevant information in a structured format (for example through a RESTful API). In our case, while most information is open, there are still documents requiring some form of payment or subscription for full access (DoGi database); moreover, most data are not available in a machine-readable format, but are formatted as web pages for browsing. While the Lawiz core is data-agnostic, wrappers are free to perform any kind of data manipulation, so it is their task to convert data into a format suitable for Lawiz. Typically, a wrapper reads a web page (formatted for browsing, as mentioned), analyzes it using an XML/HTML parser, extracts the relevant data and presents it to the Lawiz core. Once the model is built, the user is free to navigate the model and manipulate the view. Lawiz does not impose rules on the wrappers’ implementation, so that a developer can bring into Lawiz almost every kind of data related to norms.

Client-side, the application and the wrappers are written in JavaScript. JavaScript is supported by all modern browsers and allows the use of a large number of libraries such as jQuery (the most popular JavaScript library for HTML manipulation), Sigma (graph drawing), jqPlot (charting), and so on. Server-side, Lawiz exposes a PHP proxy that allows the wrappers to fetch data from external sites (normally, asynchronous requests are limited to the same host due to browser security policies). Wrappers may also use their own PHP libraries.
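The asynchronous, pluggable-wrapper design can be illustrated with a short sketch. The actual wrappers are written in JavaScript; Python's asyncio is used here only for brevity, the wrapper functions are hypothetical stand-ins that sleep instead of contacting a remote source, and the sketch simplifies the workflow by running all wrappers concurrently rather than letting later wrappers expand the Normattiva results.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

Result = Tuple[str, List[str]]
Wrapper = Callable[[str], Awaitable[Result]]


async def normattiva_wrapper(query: str) -> Result:
    await asyncio.sleep(0.05)  # stands in for a slow remote request
    return "normattiva", [f"norm related to {query}"]


async def dogi_wrapper(query: str) -> Result:
    await asyncio.sleep(0.01)  # stands in for a faster remote request
    return "dogi", [f"article commenting on {query}"]


async def run_wrappers(query: str, wrappers: List[Wrapper],
                       on_result: Callable[[str, List[str]], None]) -> None:
    """Start every wrapper at once and hand over each result as soon as it arrives."""
    tasks = [asyncio.ensure_future(wrapper(query)) for wrapper in wrappers]
    for finished in asyncio.as_completed(tasks):
        source, documents = await finished
        on_result(source, documents)


model: Dict[str, List[str]] = {}


def update_model(source: str, documents: List[str]) -> None:
    # In a real interface this is where the view would be refreshed.
    model[source] = documents


asyncio.run(run_wrappers("law 241/1990",
                         [normattiva_wrapper, dogi_wrapper], update_model))
```

Adding a new document source then amounts to writing one more wrapper function with the same signature and adding it to the list.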


5. Preliminary Results and Future Developments

The work carried out so far has led to the creation of the beta version of a site¹² that allows one to interactively explore the RNN of laws and state legislative decrees. Currently the Lawiz retrieval process involves around 710,000 documents (as described in detail in table 1).

Source                  Number of documents
Normattiva              88,336
Constitutional Court    56,518
Court of Cassation      161,269
DoGi                    388,819
Preparatory works       16,141¹³
Total                   710,983

Table 1. Number of documents considered by Lawiz

The interface layout (see figure 6) is structured into three areas used, respectively, for visualizing (i) the RNN, (ii) the NIM, and, in the column to the right, (iii) full-text documents and complementary information.

The results attained so far seem interesting from two different points of view. From an application point of view, the project has worked out and experimented with an innovative modality of information retrieval that is useful in application contexts and for users (judges, lawyers, legal practitioners) who want an immediate and intuitive picture of the documentary universe around a law. From a theoretical point of view, we have taken a step towards the creation of tools supporting the quantitative study of the evolution of the legal system and of single provisions inside it. This objective is served not only by the NIM but also by the possibility of working back up the chain of connections that link the Root to preceding normative acts. It is thus possible to study interesting aspects such as the level of complexity of the legal system or the existence of content links between different corpora of norms.

¹² http://www.isislab.it:20080/mylawiz/
¹³ The number shown refers to the number of cards, which in turn refer to various documents.

Figure 6. Lawiz: interface layout

The future developments we are thinking about move in three directions:

1. enhancing the construction of the RNN, through (i) enlargement of the document base, so as to include in the search other categories of documents, for instance judgements adopted by lower courts or EU legal measures; and (ii) evaluation of semantic elements making it possible to identify content links between different sources going beyond explicit normative references. For this purpose it will be possible to use metadata of a semantic type already associated with the documents considered by Lawiz (like the descriptors of the DoGi classification scheme¹⁴ or the descriptors of the thesaurus used by Chamber and Senate¹⁵).
2. implementing new functions of analysis of specific characteristics of the retrieved documents (e.g., importance) and of their mutual relationships (e.g., pertinence), also employing techniques and metrics of social network analysis (algorithms of clustering and community detection, assigning weights to the ties among nodes).
3. improving the interface for the purpose of facilitating navigation in the RNN and consultation of the documents that make it up.

¹⁴ http://nir.ittig.cnr.it/dogiswish/consistenze/class2000.php
¹⁵ https://www.senato.it/3235?testo_generico=745

References

Bommarito II, Katz (2009), Properties of the United States Code Citation Network. Available at SSRN: http://ssrn.com/abstract=1502927.
Bommarito II, Katz (2010), A Mathematical Approach to the Study of the United States Code. http://arxiv.org/abs/1003.4146.
Cross, Frank B. et al. (2010), Citations in the U.S. Supreme Court: an Empirical Study of Their Use and Significance, University Illinois Law Review, No. 2 p. 491. Available at: http://illinoislawreview.org/wp-content/ilr-content/articles/2010/2/Cross.pdf.
Curtotti, M., & McCreath, E. (2012, October). Enhancing the visualization of law. In Law via the Internet Twentieth Anniversary Conference, Cornell University.
Curtotti, M., McCreath, E., & Sridharan, S. (2013, June). Software tools for the visualization of definition networks in legal contracts. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Law (pp. 192-196). ACM.
Fowler, J. H. and Jeon, S., “The Authority of Supreme Court Precedent: A Network Analysis,” (Jun 29, 2005) jhfowler.ucdavis.edu/authority_of_supreme_court_precedent.pdf.
Francesconi, E., Marchetti, C., Pietramala, R., & Spinosa, P. (2010). A URN Standard for Legal Document Ontology: a Best Practice in the Italian Senate. In Proceedings of LOAIT 2010 IV Workshop on Legal Ontologies and Artificial Intelligence Techniques (p. 53).
Geist, A. (2009), Using Citation Analysis Techniques For Computer-Assisted Legal Research in Continental Jurisdictions. Available at: https://www.era.lib.ed.ac.uk/handle/1842/3511
Smith, T., “The Web of Law,” San Diego Legal Studies Research Paper No. 06-11 (Spring 2005). Available at SSRN: ssrn.com/abstract=642863.
Stevenson, Drury D. and Wagoner, Nicholas J., Bargaining in the Shadow of Big Data (March 7, 2014). Available at SSRN: http://ssrn.com/abstract=2325137
Stevenson, Drury D. and Wagoner, Nicholas J., Lawyering in the Shadow of Data (September 12, 2013). Available at SSRN: http://ssrn.com/abstract=2325137 or http://dx.doi.org/10.2139/ssrn.2325137
Zhang J., Visualization for Information Retrieval, Springer, 2008.

Why do you quote me? Citation of Superior Court orders in the Sicilian courts

Deborah De Felice¹, Giuseppe Giura¹ and Vilhelm Verendel²
¹ University of Catania, ² Chalmers University of Technology

In the Italian judicial system, the Criminal Cassazione Court (hereafter Superior Court) issues almost 50,000 judgments each year. This huge number of orders leaves the lower courts (courts of first instance and appellate courts) wide discretion in selecting which supreme judgments they consider useful for justifying their decisions. This paper attempts an empirical description of the ways in which the lower courts overcome the problems related to the overproduction of case law, which is tied to a more limited "local justice" that has passed the Supreme Court’s scrutiny on points of law. We analyzed the content of 728 criminal sentences for serious organized crime offences issued by lower courts in the four Districts of the Court of Appeal in Sicily. The analysis concerned the citations of Supreme Court judgments in the texts of these 728 criminal orders. Once identified, these Supreme Court orders were extracted from the database of the Supreme Court and used to determine which lower courts issued their judgments on the basis of those orders, applying them as a corroboration of legitimacy. The third step was the comparison of the judgments on the merits within the examined population. Analysing the different degrees of correspondence between the compared judgments, we registered: a close correspondence where both were related to the same District Court of Appeals; a strong correspondence where they refer to contiguous districts; a weak correspondence where they refer to one of the four districts in Sicily; and a lack of correspondence where the cases cited were not issued in any of the four districts. In the course of our research we also selected special cases: cases whose strength is such that they cannot be ignored, and cases that at first sight seem to belong to the fourth type but on reading turn out to be attributable to the first type.

The detailed analysis was made possible by an algorithm that we developed for this purpose. The algorithm takes computer-stored plaintext representations of court sentences as input and automatically extracts all legal citations from a corpus of sentences. Moreover, from each sentence we extract the local court district as well as details about each citation (such as year, name, etc.). This makes it possible to analyse the aggregate citation patterns of particular districts and to make a comparative study between districts. Using this methodology, we observe that the different legal districts are characterised by different citation patterns. Furthermore, this difference consists in citing Superior Court orders with different origins. Our statistics serve as a basis for further investigation of the local legal culture phenomenon.
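A minimal sketch of such an extraction step is given below. It is not the authors' algorithm: the citation pattern is a purely illustrative regular expression for strings such as "Cass. pen., sez. I, n. 12345/2008", the input format (pairs of district and plaintext) is an assumption, and the sample texts are made up.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Illustrative pattern only; real citation formats vary far more than this.
CITATION = re.compile(r"Cass\.[^\n;]{0,40}?n\.\s*(?P<number>\d+)/(?P<year>\d{4})")


def extract_citations(text: str) -> List[Tuple[str, int]]:
    """Return (number, year) for every citation matched in one court sentence."""
    return [(m.group("number"), int(m.group("year")))
            for m in CITATION.finditer(text)]


def citation_profiles(sentences: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Aggregate the citations of each district over (district, text) pairs."""
    profiles: Dict[str, Counter] = defaultdict(Counter)
    for district, text in sentences:
        profiles[district].update(extract_citations(text))
    return profiles


# Made-up corpus of two sentences from two districts.
corpus = [
    ("Palermo", "Come affermato da Cass. pen., sez. I, n. 12345/2008; "
                "si veda anche Cass. n. 678/1999."),
    ("Catania", "In senso conforme Cass. n. 12345/2008."),
]
profiles = citation_profiles(corpus)
shared = set(profiles["Palermo"]) & set(profiles["Catania"])
```

Comparing the per-district counters (for instance through the citations they share) is one simple way to contrast the citation patterns of two districts.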

Towards a Legal Recommender System

Radboud WINKELS¹, Alexander BOER, Bart VREDEBREGT & Alexander van SOMEREN
Leibniz Center for Law, University of Amsterdam

Abstract. In this paper we present the results of ongoing research aimed at a legal recommender system where users of a legislative portal receive suggestions of other relevant sources of law, given a focus document. We describe how we make references in case law to legislation explicit and machine readable, and how we use this information to adapt the suggestions of other relevant sources of law. We also describe an experiment in categorizing the references in case law, both by human experts and unsupervised machine learning. Results are tested in a prototype for Immigration Law.

Keywords. Semantic Web, citations, MetaLex, network analysis, cluster analysis, reasons for citing

1. Introduction

More and more sources of law are freely available online in the Netherlands, but also in the rest of Europe and the world. Most of the time, however, these are stand-alone databases, containing one type of document, not linked to other sources. For instance the Dutch portal for case law – rechtspraak.nl – contains a (small) part of all judicial decisions in the Netherlands. Case citations in these decisions are sometimes explicitly linked, references to legislation are not.2 From earlier research we know that professional users of legal documents would like to see and have easy access to related ones from other collections. E.g. when we evaluated a prototype system that recommends other relevant articles and laws to users of the official Dutch legislative portal, they told us they would like to see relevant case law and parliamentary information as well [11].

In this paper we present a first step in that direction. The new version of our portal presents relevant case law, given a legislative article in focus for a user, and adapts the ranking of relevant other articles based on the related case law. The idea is that judges, in explaining and justifying their verdicts (applying the law in practice), indicate that the sources they cite are somehow related. For that reason, we also investigate the reasons for citing and whether these can be recognized automatically. If so, we could use these to suggest relevant sources to (professional) users even better.

The ultimate aim of this research is to build a Legal Recommender System based on content filtering for now (see [9] for more on general recommender systems). Later on we may also include collaborative filtering, i.e. use the actions and behaviour of other users to predict relevant new material. We will first describe how we created the network between case law and legislation, then the prototype recommender system for immigration law and next our attempt at classifying the use of references to legislation in court decisions. We will end with conclusions and a discussion of results.

1 Corresponding author: Radboud Winkels, Leibniz Center for Law, University of Amsterdam, PO Box 1030, 1000 BA Amsterdam, Netherlands; Email: [email protected]
2 Except for a metadata element for recent cases in the header of the document that contains the ‘main’ article(s) for the decision.

2. Related Work

Several researchers have applied network analysis to legal data, but as far as we know only to one type of data at a time: case law as in [2][12], or legislation as in [4][6]. Van Opijnen [8] uses links to legislation in Dutch case law when deciding upon the relevance of a particular case, but not to suggest other relevant sources of law and not as an applied context for legislation. He distinguishes two types of references from case law to legislation: a procedural one and a substantive one. He is only interested in substantive ones. He also notes how often a decision references an article, the hierarchical position of the referred law and the document structure level of the reference (e.g. whether the reference is to a chapter or an article).

Zhang & Koppaka [13] discuss reasons for citing (RFC) prior cases in US case decisions: the text area around a case citation. Since the case in focus is citing another case, they compare this text area to the ones in the cited case. We are looking at references to legislation, so a simple text comparison of the two documents will not do.

3. Creating the Network

As stated above, the court decisions published at the official Dutch portal do not contain explicit, machine readable links to cited legislation. The texts are available in an XML format, basically divided in paragraphs using tags, with a few metadata elements. The most relevant metadata for our purpose are:
• The date of the decision (‘Uitspraakdatum’)
• The field of law (‘Rechtsgebied’)
• The court (‘Instantie’)

We decided to work with a subset of all available data and chose the field of immigration law. The area is large enough, has lots of (recent) cases and we have access to experts at the Dutch Immigration Service (IND). We used the ‘field of law’ metadata element mentioned above for the selection of cases.
That resulted in a set of 13,311 documents to work with.

3.1. Locating and resolving references

For locating references to legislation we use regular expressions, as we have done in the past, together with a list of names and abbreviations of Dutch laws [5]. This list also contains the official identifier of the law (the BWB-number), which can be used for resolving the reference later on. We consider high precision to be more important than high recall. Users will forgive us if we miss a reference, but will be annoyed by false ones.

We evaluated this procedure by checking 25 randomly selected documents by hand. These documents contained 163 references to legislation of which 141 were correctly identified (recall of 87%). There was one false positive (precision of 99%). The references we missed were mostly those to the ‘Vreemdelingencirculaire’ (a lower law that has a different structure than regular ones) and treaties with very long names like ‘Europees Verdrag tot bescherming van de rechten van de mens en de fundamentele vrijheden’3. If we tried to capture this with the regular expression, the regular expression would match too easily, often matching entire sentences where it should have matched only the law. We declared these conventions outside the scope of this experiment.

3 ‘Convention for the Protection of Human Rights and Fundamental Freedoms’.

[Figure 1 diagram: works ‘article 35’, ‘article 37’ and ‘case 1’; expressions of article 35 dated 2010-09-01 and 2012-01-01, expressions of article 37 dated 2011-04-01 and 2012-05-05, and the ruling of 2012-02-09; edges labelled ‘refers to’ and ‘realizes’.]

Figure 1: Bibliographic levels of documents and referencing. Article 35 has two expressions, both refer to the work level of article 37. A decision Case 1 (the only expression of the work) refers to article 37, given the date probably to the first expression, but we represent it as referring to the work level (red arrow).

Resolving the references was a bit trickier, since sometimes they used anaphora, e.g. referring to ‘that law’. In that case, the citation was resolved by using the previous law identifier if it existed. We used the same process for resolving ambiguous title abbreviations; e.g. ‘WAV’ is an abbreviation of ‘Wet Arbeid Vreemdelingen’, ‘Wet Ammoniak en Veehouderij’ and ‘Wet Ambulancevervoer’.4 Most of the time the full title is used before the abbreviation is used.

Another issue is determining the exact version of the law the case refers to.5 Typically, a judge will refer to the version that is in force at the moment of the decision, but it may also be the version that was in force at the time of the relevant facts, or sometimes even an earlier version of the relevant law. We cannot decide which version is the correct one without interpreting the content of the case. Therefore we decided to resolve the reference to the work level of the source of law, i.e. no particular version (cf. Figure 1). The resulting references are added to the XML of the case law document.6

The final network of the 13,311 case documents has 85,639 links to legislation (on average about 6.4 references per case); the links connect the ECLI identifier7 of the case with the BWB identifier of the source of law (see above). We evaluated the resolving process by checking 250 randomly selected references by hand. Of these, 234 should have been resolved since the other 16 were outside the scope of this experiment. 198 were resolved correctly (a recall of 85%). We had 10 ‘false’ positives, i.e. references that were declared out of scope, so a precision of 95%. The results were good enough to continue.
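The locating and resolving steps described above can be sketched roughly as follows. This is a minimal illustration, not our actual parser: the law list is a hypothetical three-entry excerpt, the identifiers are placeholders rather than real BWB numbers, the pattern only covers the simplest ‘artikel N van de ...’ form, and the sample sentence is invented. Ambiguous abbreviations and the anaphor ‘die wet’ (‘that law’) are resolved from what was mentioned earlier; when nothing usable precedes them, the reference is dropped, in line with preferring precision over recall.

```python
import re

# Hypothetical excerpt of the law list: full title -> identifier.
# The identifiers are placeholders, not real BWB numbers.
LAW_IDS = {
    "Vreemdelingenwet 2000": "BWB-EXAMPLE-1",
    "Wet Arbeid Vreemdelingen": "BWB-EXAMPLE-2",
    "Wet Ammoniak en Veehouderij": "BWB-EXAMPLE-3",
}
# An abbreviation maps to every title it may stand for.
ABBREVIATIONS = {
    "Vw 2000": ["Vreemdelingenwet 2000"],
    "WAV": ["Wet Arbeid Vreemdelingen", "Wet Ammoniak en Veehouderij"],
}

# Longest names first, so a full title wins over a shorter overlapping name.
_names = sorted([*LAW_IDS, *ABBREVIATIONS], key=len, reverse=True)
REFERENCE = re.compile(
    r"artikel\s+(?P<article>\d+[a-z]?)\s+(?:van\s+)?(?:(?:de|het)\s+)?"
    r"(?P<law>" + "|".join(map(re.escape, _names)) + r"|die\s+wet)"
)


def resolve_references(text):
    """Return (article, law identifier) pairs in order of appearance."""
    resolved, last_title = [], None
    for match in REFERENCE.finditer(text):
        law = " ".join(match.group("law").split())
        if law == "die wet":
            title = last_title  # anaphor: reuse the previous law, if any
        elif law in LAW_IDS:
            title = law
        else:
            # Abbreviation: prefer the candidate title mentioned most recently.
            options = ABBREVIATIONS[law]
            position, title = max(
                (text.rfind(option, 0, match.start()), option) for option in options
            )
            if len(options) > 1 and position < 0:
                title = None  # ambiguous and never spelled out before
        if title is None:
            continue  # skip rather than guess
        last_title = title
        resolved.append((match.group("article"), LAW_IDS[title]))
    return resolved


# Invented sample sentence, for illustration only.
SAMPLE = (
    "Ingevolge artikel 2 van de Wet Arbeid Vreemdelingen (WAV) is het verboden "
    "een vreemdeling arbeid te laten verrichten. Gelet op artikel 19 WAV en "
    "artikel 29 van de Vreemdelingenwet 2000, alsmede artikel 31 van die wet, "
    "wordt het beroep ongegrond verklaard."
)
references = resolve_references(SAMPLE)
```

Resolving to a bare law identifier, without a version, mirrors the choice to link to the work level rather than to a particular expression.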

4 ‘Labour immigration law’, ‘Ammoniac and Livestock law’ and ‘Ambulance law’ respectively. Given our domain, the first one is most likely the correct one, but we want to implement generic mechanisms.
5 Which expression of the work in terms of bibliographic references as used by e.g. CEN MetaLex [10].
6 These references can be recognized by a ‘metaLexResourceIdentifier’ attribute of the ‘dcterms:reference’ tag and the fact that it has a ‘dcterms:string’ as child.
7 ‘European Case Law Identifier’; see Council conclusions on ECLI at: http://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX:52011XG0429(01)
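The precision and recall figures of the two hand-checked evaluations in Section 3.1 can be recomputed from the reported counts. The helper below is only a worked check of that arithmetic; its name is ours.

```python
def precision_recall(true_positives, false_positives, total_relevant):
    """Precision and recall from hand-checked counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / total_relevant
    return precision, recall


# Locating: 141 of 163 references found, 1 false positive.
locating = precision_recall(141, 1, 163)
# Resolving: 198 of 234 in-scope references resolved, 10 false positives.
resolving = precision_recall(198, 10, 234)

locating_percent = [round(100 * value) for value in locating]
resolving_percent = [round(100 * value) for value in resolving]
```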

3.2. Weighing the Network

Since case decisions may refer to the same source of law, e.g. an article, more than once, we count the number of references and compute the weight of the link between the case and the article from the number of occurrences of that reference. The lower the weight, the stronger the impact on the network.

In [11] we described a first version of our legal recommender system. It only presented other legislative suggestions to a user, given her focus on a specific article. These suggestions were based on characteristics of the network of the tax laws used in the study. The system runs on top of the MetaLex document server [3], containing all Dutch regulations, including historic versions, from the official portal wetten.nl in linked data format. Now we include the network of cases referring to legislation as described above. All processing is done in real time.

4. Prototype Legal Recommender System

When a user clicks an article, the related case law and legislation are retrieved:

1. The system checks whether the article appears in the case law network. If so, it creates a so-called ego graph, a local network containing all the nodes and edges within a certain weighted distance from the current node [7]. Since the network only contains references from case law to legislation and within legislation, we know that if we take 2 steps, ignoring the direction of the reference, we will end up in a legislative node again. A weighted distance of 2 will give us most of the legislative nodes related to the current node, but even these networks may become rather large to handle for our present prototype, running on a simple machine.8 We start searching with a weighted distance of 0.4 and gradually increase it up to 2.0 until we have a sufficiently large, but still manageable network.

Figure 2: The prototype legal recommender system. The user has article 2a of the Immigration law in focus. Current version is January 2014 (see pull-down menu at left). Relevant other articles are presented in window A (red or dark border) on the left and below that relevant case law.

8 Single core virtual private server - notorious for low performance.

2. To find relevant legislation, the system also checks whether the current node is in the legislative network of the MetaLex document server [3]. If so, it again creates an ego graph, this time for an unweighted network. To control the size of the graph, we use only references coming from the selected version (expression) of the current node.

3. If we have two local networks, we want to combine them in order to (better) predict the importance of legislative nodes. To do this, we need to assign weights to the legislative graph. We chose the value 0.1 as it allows the legislative network to influence the result but not overrule the case law references.9

4. Finally, we use betweenness centrality on the combined network to determine the most relevant articles for the current focus. The betweenness centrality of a node is the sum of the fraction of all-pairs shortest paths that pass through that node. One would expect that the focus node has the highest betweenness centrality in the local network, but this proved not always to be the case. The focus node was however always in the top-5.10 The results are shown to the user in the top of frame A of Figure 2.
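The four steps above can be sketched with NetworkX as follows. This is a simplified illustration under stated assumptions, not the prototype's code: the edge weight is assumed to be the reciprocal of the citation count (one choice consistent with ‘the lower the weight, the stronger the impact’), the radius steps and the `min_nodes` threshold are assumed values, and the node names with their `art:` and `case:` prefixes are a hypothetical convention.

```python
import networkx as nx

LEGISLATIVE_WEIGHT = 0.1  # value reported in step 3


def case_law_graph(citations):
    """Build the case law network from (case, article, count) triples."""
    graph = nx.Graph()  # undirected: the direction of references is ignored
    for case, article, count in citations:
        # ASSUMPTION: weight = 1 / count, so frequent citations are "closer".
        graph.add_edge(case, article, weight=1.0 / count)
    return graph


def growing_ego_graph(graph, focus, min_nodes, radii=(0.4, 0.8, 1.2, 1.6, 2.0)):
    """Widen the weighted radius until the local network is large enough."""
    for radius in radii:
        ego = nx.ego_graph(graph, focus, radius=radius, distance="weight")
        if ego.number_of_nodes() >= min_nodes:
            break
    return ego


def rank_articles(case_graph, legislation_graph, focus, min_nodes=4):
    """Rank articles by betweenness centrality in the combined local network."""
    local = nx.Graph()
    if focus in case_graph:
        local = growing_ego_graph(case_graph, focus, min_nodes)
    if focus in legislation_graph:
        legislative = nx.ego_graph(legislation_graph, focus, radius=1)
        nx.set_edge_attributes(legislative, LEGISLATIVE_WEIGHT, "weight")
        local = nx.compose(local, legislative)
    scores = nx.betweenness_centrality(local, weight="weight")
    # Hypothetical convention: article nodes carry the prefix "art:".
    articles = [(node, score) for node, score in scores.items()
                if node.startswith("art:")]
    return sorted(articles, key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration.
cases = case_law_graph([
    ("case:1", "art:2a", 2), ("case:1", "art:3", 1),
    ("case:2", "art:2a", 1), ("case:2", "art:8", 4),
])
legislation = nx.Graph([("art:2a", "art:9"), ("art:2a", "art:3")])
ranking = rank_articles(cases, legislation, "art:2a")
```

For larger local networks, `betweenness_centrality` accepts a `k` argument that estimates the scores from a sample of `k` source nodes, which is one way to trade accuracy for speed.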

4.1. A First Evaluation

We asked several professional users of the Dutch Immigration Service to use the prototype system and fill in an evaluation form afterwards. Due to the holiday season, only 3 users have replied so far, but they were positive. They appreciated the clean and uncluttered interface, indicated that it was easy to understand without help and liked the indication of the number of times a case was referring to an article when you click on a case. They also noted that the inclusion of case law added suggestions of relevant articles that were not available in the previous system. They complained about the slowness of the system and the fact that references to the ‘Vreemdelingencirculaire’ were missing (as we explained above).

Table 1: Ten most frequent patterns found by hand in a sample of 30 cases and automatically in all cases11

Rank | Dutch term | English translation | Sample | All
1 | ingevolge | due to | 58 | 28,609
2 | als/in bedoeld(e) in/onder | as referred to in | 58 | 21,564
3 | op grond van | because of | 32 | 17,574
4 | met/om toepassing van/aan (in...) | pursuant to | 29 | 15,574
5 | (is) bepaald in/met | is determined in | 28 | 7,893
6 | strijd(ig) met | contrary to | 17 | 5,793
7 | in de zin van | in the sense of | 13 | 3,607
- | is niet van toepassing | shall not apply to | 6 | 2,818
- | schending van | violation of | 4 | 2,607
8 | zich/met beroep(en) op | to invoke | 13 | 1,837
- | gelet op | having regard to | 4 | 1,765
9 | krachtens | under | 12 | 1,580
10 | juncto/jo | in relation to | 7 | 783

9 We use the NetworkX library in Python for this purpose.
10 In fact we use betweenness centrality approximation, which proved to use up to 50% less time without significant changes in the results.
11 Actually number 10 (“juncto”) was lower down the list for all cases.

5. Reasons for Citing

We examined 30 randomly selected court decisions to identify repeating patterns used in citing legislation. The most frequently used terms were “due to” (“ingevolge”) and “as referred to in” (“als bedoeld in”), cf. Table 1. We hoped these keywords were somehow related to the reason for citing the legislation. To investigate whether this was the case, the matter was presented to three experts in this field: two judges and one person who teaches students how to read court decisions. We had them sort cards on which the keywords were printed on one side and examples of their use in actual cases on the flip side. The two judges were very sceptical and did not think that the keywords would be good indicators for categories. In fact they did not think that such categories existed at all. The third expert was very interested and came up with this categorisation:

1. Keywords that indicate a selection, identifying essential laws for a case. Examples from Table 1 are: “pursuant to”, “because of”, and “to invoke”.
2. Keywords that indicate application of law. Examples are: “due to ...”, “as referred to in...” and “under”.
3. Keywords that indicate a concluding (denying) function; it is an answer to the first category. Examples are: “contrary to” and “by way of derogation from” (“in afwijking van”, not in Table 1).
4. The last category only consisted of the keyword “in relation to”. This is an arguably uninteresting category.

The first three categories coincide with steps in the task judges perform when deciding a case: normative assessment [1].

5.1. Assigning Keywords to References for Clustering

For all keywords occurring more than twice, a regular expression was constructed to automatically identify which keyword was present in a certain reference. These regular expressions are stored in a ‘csv’ file and are sorted by frequency as in Table 1.
The algorithm processing the references found in the cases described above will greedily choose the first matching regular expression, without considering alternatives. This could result in a more skewed distribution than is actually the case, but it is a relatively simple way to label the references. To counteract the unrealistic skewness, the first regular expressions have been made more complex to avoid false positives. In total 152,853 references were found in the domain of immigration law. Of those, 120,855 (79%) were assigned a keyword.

5.2. Other Features for Clustering

Apart from the keywords, some other features were identified:

Position of the reference in the document: The idea is that different sections of a decision play different roles. Though these sections are not distinguished in a machine readable form, their order remains more or less constant. The relative position of the reference should therefore give an estimate of the section in which it occurs. In Figure 3 the distribution of the variable is plotted for a sample. It is clear that the reference density is higher in the first section of a court decision.

Law-identification number: This feature is directly copied from the original xml data.

Type of court: This feature existed in two forms. One in which it is copied from the original xml file; the other is a simplified one that only distinguishes courts from higher or supreme courts. This simpler version was used because some local courts were very rare.12

Year: The year of the decision, extracted from the original data. It could be that something changed in the reference-structure over the years. It was converted to ‘years ago’ by subtracting it from the current year.

Reference url: The actual reference, also copied from the original data. It is much more detailed than the law identification number above. Different sections of a law may play different roles, so this could very well be an important feature.

Figure 3: Distribution of the position of the reference in the court decision. A sample of 100,000 references was used for this plot.
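The keyword labelling of Section 5.1 and two of the derived features of Section 5.2 can be sketched as below. The patterns are simplified stand-ins for the regular expressions we actually used (which are more complex for the first entries), the list holds only four of the keywords, and the function names are ours; the current year is taken to be 2014.

```python
import re

# Ordered by frequency as in Table 1; simplified, illustrative patterns only.
KEYWORD_PATTERNS = [
    ("ingevolge", re.compile(r"\bingevolge\b", re.IGNORECASE)),
    ("als bedoeld in", re.compile(r"\bbedoelde?\s+(?:in|onder)\b", re.IGNORECASE)),
    ("op grond van", re.compile(r"\bop\s+grond\s+van\b", re.IGNORECASE)),
    ("krachtens", re.compile(r"\bkrachtens\b", re.IGNORECASE)),
]


def label_reference(context):
    """Greedily return the first keyword whose pattern matches the context."""
    for keyword, pattern in KEYWORD_PATTERNS:
        if pattern.search(context):
            return keyword  # first match wins, alternatives are not considered
    return None  # the reference stays without a keyword


def relative_position(offset, document_length):
    """Position of the reference as a fraction of the decision's length."""
    return offset / document_length


def years_ago(decision_year, current_year=2014):
    """Convert the year of the decision to 'years ago'."""
    return current_year - decision_year
```

Because the list is tried in order, a context containing both “als bedoeld in” and “ingevolge” is labelled “ingevolge”, which is the source of the skew mentioned in Section 5.1.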

5.3. Clustering References

Since we do not have an explicit theory for classifying references, nor a set of labelled data, we decided to apply unsupervised learning to see whether natural clusters exist in the data. Two rounds of analysis have been performed. The first round was more experimental, the second round was used for actual interpretation. In the first round all features mentioned were used except the reference URL. This feature was originally left out to drastically decrease the dimensionality of the input data. This made the results much easier to interpret. In the second round, the simpler version of the ‘type of court’ feature was used.

The clustering was performed in Weka 3.6.11 (see footnote 13) using expectation maximisation (EM). This algorithm maximises the log likelihood of the data given the model by fitting Gaussian distributions on numeric attributes. It uses discrete estimators for nominal values. Since it is not clear how many clusters we are looking for, multiple numbers of clusters were tried and compared by cross validation. Higher numbers of clusters were penalised, since the log likelihood is guaranteed to increase in this algorithm when the number of clusters increases. This was performed using Weka by setting the numClusters variable to -1.

The expectation maximisation algorithm has the advantage that the algorithm itself is comprehensible and its results are also easy to interpret. For the numeric attributes it will return mean (µ) and standard deviation (σ) of the best fitted Gaussian per cluster. For nominal values it will return frequencies that are Laplace corrected (i.e. one is added to each count to make dividing by zero impossible). Every instance is part of every cluster with some likelihood. While these instances will eventually be assigned to the cluster with the highest likelihood for that instance, their values will be proportionally distributed according to the different likelihoods. Therefore all frequencies in the results will be greater than 1.
To evaluate how much each attribute contributed to the formation of the clusters, the ChiSquaredAttributeEval of Weka was used. In this method, the χ2-statistic is calculated for every feature. The result is a ranked list of the features, telling how much they contribute to distinguishing the classes.

12 In fact, most cases were from courts in ‘Den Haag’.
13 See: http://www.cs.waikato.ac.nz/ml/weka/
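To make the EM procedure more concrete, the sketch below fits a mixture of Gaussians to a single numeric attribute and keeps the best of several randomly seeded runs. It is a deliberately small, pure-Python stand-in and not Weka's implementation: it handles one numeric attribute only, does no cross validation over the number of clusters, and the toy positions, the iteration count and the lower bounds on the standard deviation and density are assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Gaussian density, floored to avoid exact zeros from underflow."""
    density = math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
    return max(density, 1e-300)


def fit_em(data, k, seed, iterations=100, min_sd=1e-3):
    """One EM run; returns (log likelihood, weights, means, sds)."""
    rng = random.Random(seed)
    means = rng.sample(data, k)  # random initialisation component
    centre = sum(data) / len(data)
    spread = math.sqrt(sum((x - centre) ** 2 for x in data) / len(data))
    sds = [max(spread, min_sd)] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: every instance belongs to every cluster with some likelihood.
        memberships = []
        for x in data:
            parts = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(parts)
            memberships.append([part / total for part in parts])
        # M-step: re-estimate weight, mean and standard deviation per cluster.
        for j in range(k):
            mass = sum(row[j] for row in memberships)
            weights[j] = mass / len(data)
            means[j] = sum(row[j] * x for row, x in zip(memberships, data)) / mass
            variance = sum(row[j] * (x - means[j]) ** 2
                           for row, x in zip(memberships, data)) / mass
            sds[j] = max(math.sqrt(variance), min_sd)
    log_likelihood = sum(
        math.log(sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)))
        for x in data
    )
    return log_likelihood, weights, means, sds


def best_of(data, k, seeds):
    """Keep the run with the highest log likelihood."""
    return max((fit_em(data, k, seed) for seed in seeds), key=lambda run: run[0])


# Toy relative positions of references, invented for illustration.
POSITIONS = [0.05, 0.10, 0.15, 0.10, 0.12, 0.08, 0.80, 0.85, 0.90, 0.82, 0.88, 0.86]
best = best_of(POSITIONS, 2, range(5))
```

Restarting from several seeds does not guarantee a global optimum either; it only reduces the chance of reporting a poor local one.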

5.4. Clustering Results

We used a sample size of 8,000 for the clustering experiment. One run took approximately ten minutes to complete. This made it possible to experiment with slight changes without it being too time consuming. Since the EM-algorithm is not guaranteed to find a global optimum, the experiment was conducted 5 times with different seeds (for the random initialisation component) and the results with the highest log-likelihood were used for interpretation. This still does not guarantee a global optimum, but this is a limitation of the algorithm that cannot be totally avoided.

From the results of the χ2 evaluation (Table 2) it can be seen that ‘reference url’ is the most distinguishing attribute. This fits the idea that different parts of a law are referred to for different reasons. The EM algorithm distinguished 12 different clusters (Figure 4). To be better able to interpret the results, the following steps were taken:
• All options of the attributes ‘reference url’ and ‘law identification number’ that occurred fewer than 12 times were ignored.14
• All frequency values (i.e. all except ‘position of reference’ and ‘year’) were normalized by dividing them by the relative frequency of the class and the total frequency of the attribute.

Cluster 3 is the largest and contains 30% of all instances. Almost all references (97%) are to the immigration law (‘Vreemdelingenwet 2000’) and 66% of all references to this law are in this cluster. There are no references to administrative law.

Table 2: χ2 results of the final round EM clustering with 8,000 instances.

Average merit (µ) | σ | Attribute
56,719 | 290 | Reference url
28,381 | 188 | Law identification number
12,786 | 46 | Keyword
9,714 | 89 | Position of the reference
1,478 | 25 | Type of court
1,365 | 64 | Year
0 | 0 | Instance number
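As an illustration of the statistic behind Table 2, the χ2 value of a nominal attribute with respect to the cluster assignment can be computed from the contingency table of attribute values and clusters. The function below is a plain textbook computation written for this purpose, not Weka's ChiSquaredAttributeEval, and the example values are invented.

```python
from collections import Counter


def chi_squared(attribute_values, cluster_labels):
    """Chi-squared statistic of a nominal attribute against cluster labels."""
    n = len(attribute_values)
    observed = Counter(zip(attribute_values, cluster_labels))
    value_totals = Counter(attribute_values)
    cluster_totals = Counter(cluster_labels)
    statistic = 0.0
    for value, value_total in value_totals.items():
        for cluster, cluster_total in cluster_totals.items():
            # Expected count if attribute and cluster were independent.
            expected = value_total * cluster_total / n
            statistic += (observed[(value, cluster)] - expected) ** 2 / expected
    return statistic


clusters = [0, 0, 1, 1]
separating = chi_squared(["a", "a", "b", "b"], clusters)     # follows the clusters
uninformative = chi_squared(["a", "b", "a", "b"], clusters)  # independent of them
```

An attribute whose values line up with the clusters scores high, while one that is spread evenly over the clusters scores zero, which is how ‘instance number’ ends up at the bottom of Table 2.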

Cluster 1 contains relatively many references to a law concerning relief of asylum seekers (‘Wet Centraal Orgaan opvang asielzoekers’), but also many to immigration law. The references in this cluster appear at the beginning of a court decision. Cluster 2 is difficult to categorize. It contains relatively many references with keywords like ‘due to’, ‘as referred to in’, ‘is determined in’ and variations. It may be the second category that our expert indicated as ‘applications of law’ (Section 5). Cluster 7 contains all references to the Convention for the Protection of Human Rights and Fundamental Freedoms. Most of these references are to article 3, which concerns the prohibition of torture, article 8 (Right to respect for private and family life) and article 5 (Right to liberty and security). Cluster 6 contains relatively many references to the administrative law articles 8.81-8.87 on preliminary provisions (‘voorlopige voorziening’).

14 An arguably arbitrary number based on the number of clusters (12).

Figure 4: Frequency distribution of the clusters found with the EM-algorithm with 8,000 instances.

Cluster 4 contains relatively many references to the Administrative Law, mostly of a procedural kind. Just as in cluster 1, most occur at the beginning of the decision. Cluster 5 contains quite a number of references to the Convention on the Rights of the Child. Contrary to clusters 4 and 1, most references in this cluster occur at the end of decisions. Of the remaining smaller clusters it is interesting to note that Cluster 11 contains only very recent cases (µ ‘Year’=1.7 years from 2014 with σ=0.82). Cluster 0 only contains references with the keyword ‘does not apply’ together with art. 6.6 of the Administrative law, which is about whether an appeal can be declared inadmissible. Cluster 8 finally is a very small, but interesting cluster that contains references from on average 13.3 years ago (with a standard deviation of 1.05 years). Note that 14 years ago a new immigration law was adopted. Maybe that considerably changed something in the data as a result of which the EM algorithm needed to assign a separate cluster to it.

6. Conclusions and Further Research

We have shown that it works quite well to automatically find and resolve references to legislation in Dutch case law. These references can be used to provide users of the legislative portal with relevant judicial decisions given their current focus and moreover, suggest additional relevant legislative sources. The parser can easily be improved a little and the prototype system will perform much better when running on a proper server. We also did not exploit the network of case law itself this time. In [12] we showed that this can be used to estimate the authority of cases, so if we include this the suggestions of relevant case law should be improved.

We also showed that it is possible to categorise references from case law to legislation. Literature and experts did not really give us an indication of the categories we needed to look for, therefore we used a data driven approach.
That way we could distinguish 12 clusters as described above. Most of the clusters make sense and are domain specific; the specific (part of) law that is cited is the main distinguishing feature. The keywords we had our experts sort are less important. Maybe only cluster 2 is an example of a more generic cluster that was also indicated by one of our experts. It is important to keep in mind that a data driven categorisation does not have to be natural or acceptable for legal professionals. The question of whether this categorisation is useful for them remains open.

In the future we may also decide to use some additional features, such as:
• The relative frequency of the reference in a court decision (what van Opijnen called ‘multiplicity’ [8]).
• The hierarchical position of the law, e.g. whether the referred law is a European directive or treaty, or a governmental decree.
• Document structure level. A lower document structure level (e.g. article or clause instead of a chapter) suggests a more specific reference, which could indicate a different role.
• Bag of Words, which was actually implemented, but left out due to increased complexity.15 The bag of words could be interesting on both sides of the reference. It could be computed on the paragraph of the court decision in which the reference occurs, as well as on the section of the law the reference refers to.
• The position of the specific section referred to. The idea is similar to the position of the reference in the case decision, which we already use. Laws are drafted in a structured way. The position could indicate the role an article plays in the law and thus a certain role of the reference in the court opinion. An example is that definitions often appear in the beginning of a law.

We also intend to use some other data we extract from the cases:
• An ordered list of all the references in the decision. We are interested in whether there is structure in these lists, e.g. whether some references always occur before others. Another use of these data is creating a list of n-grams to help predict the next references. E.g. if we have a set of bi-grams and a previous reference, we could look up in our bi-gram set which reference is most likely to come next, enabling us to create an autocomplete system for references.

Acknowledgements

Part of this research is co-funded by the Civil Justice Programme of the European Union in the OpenLaws.eu project under grant JUST/2013/JCIV/AG/4562. We would like to thank the legal experts for their contribution and the people of the Dutch Immigration and Naturalisation Service for evaluating the prototype.

References

[1] Breuker, J.A. (1993). Modelling Artificial Legal Reasoning. Knowledge Acquisition for Knowledge-Based Systems, LNCS Volume 723, 1993, Springer, Berlin, pp. 66-78.
[2] Fowler, J.H., Johnson, T.R., Jeon, S. and Wahlbeck, P.J. (2006). Network analysis and the law: Measuring the legal importance of supreme court precedents. Political Analysis, 15(3):324–346.
[3] Hoekstra, R. (2011). The MetaLex Document Server - Legal Documents as Versioned Linked Data. Proceedings of the International Semantic Web Conference (ISWC2011), pp. 128-143. Springer, Berlin.
[4] Liiv, I., Vedeshin, A. and Täks, E. (2007). Visualization and structure analysis of legislative acts: a case study on the law of obligations. ICAIL 2007, pp. 189-190, ACM.
[5] Maat, E. de, Winkels, R., and Engers, T. van (2006). Automated detection of reference structures in law. In T. van Engers (ed), Legal Knowledge and Information Systems. JURIX 2006, IOS Press, Amsterdam, pp. 41-50.
[6] Mazzega, P., Bourcier, D. and Boulet, R. (2009). The network of French legal codes. In ICAIL 2009, pp. 236–237.
[7] Newman, M. (2010). Networks: An Introduction. Oxford, England: Oxford University Press.
[8] Opijnen, M. van (2014). Op en in het Web. Boom Juridische Uitgevers, Den Haag (in Dutch).
[9] Ricci, F., Rokach, L., Shapira, B. and Kantor, P. (eds) (2011). Recommender Systems Handbook. Springer Science & Business Media, LLC 2011.
[10] Saur, K. G. (1998). Functional requirements for bibliographic records. UBCIM Publications - IFLA Section on Cataloguing, 19:136.
[11] Winkels, R.G.F., Boer, A. and I. Plantevin (2013). Creating Context Networks in Dutch Legislation. In K. Ashley (ed). Legal Knowledge and Information Systems. JURIX 2013. IOS Press, Amsterdam, pp. 155-164.
[12] Winkels, R.G.F., de Ruyter, J. and Kroese, H. (2011). Determining Authority of Dutch Case Law. In K. Atkinson (ed). Legal Knowledge and Information Systems. JURIX 2011. IOS Press, Amsterdam, pp. 103-112.
[13] Zhang, P. & Koppaka, L. (2007). Semantics-based legal citation network. ICAIL 2007, pp. 123-130, ACM.

15 Execution time of the EM algorithm with bag-of-words was more than 24 hours.

A MODEL OF LEGAL SYSTEMS AS EVOLUTIONARY NETWORKS: NORMATIVE COMPLEXITY AND SELF-ORGANIZATION OF CLUSTERS OF RULES

Carlo Garbarino
Professor of Law, Bocconi University, Milan

The paper draws both on legal theory and network science to explain how legal systems are structured and evolve. The basic proposition is that legal systems have a structure identifiable through a model of them in terms of networks of rules, and that their evolution is a property of their network structure. Section 1 presents an outline of the use of networks in modeling legal systems and discusses how (potentially unlimited) sets of rules regulating individual situations (“Rules of the Case”) are produced through sequences of so-called “production links” which connect different rules and form so-called “chains of production”. Section 2 defines a network concept of normative complexity with respect to both statutes and common law cases. Section 3 introduces the concept of “clustering coefficient”, a measure of connectedness of common law cases. Section 4 discusses the main network properties of legal systems at the global and local level (the existence of nodes which operate as legal “connectors” and the “self-organization” of Rules of the Case), while section 5 shows that legal systems can be viewed as networks that undergo continuous modifications and discusses how this evolution can be conceived as a network property. Section 6 presents a case for the application of network theory to the U.S. tax system.

1. The use of networks in modeling legal systems. - 2. A network definition of normative complexity (statutes and common law). - 3. The measures of connectedness of common law cases (clustering coefficient). - 4. Legal connectors and self-organization of legal systems. - 5. Network growth and the evolutionary structure of legal systems. - 6. A case for the application of network theory to the U.S. tax system.


1. The use of networks in modeling legal systems

A wide body of extant legal research emphasizes that the law evolves in a selective way1, and this paper aims to contribute to it by discussing how such an evolution can actually be explained by a network approach. I propose here a simplified model in which the connections between rules are represented by a peculiar type of network that exhibits certain properties, and I advance the conjecture that these network properties constitute the actual backdrop of legal evolution and help to explain how it unfolds.

A network is a set of items, termed “nodes” or “vertices”, with connections among them, termed “links” or “edges”. Thus a network is a topological object consisting of nodes and links that abstracts away from the details of the represented situation except for its connectivity. There are many objects of interest in natural and social domains composed of individual parts linked together in some way that can be conceived as networks. These networks are currently the focus of active empirical studies and have been grouped by Mark Newman, one of the forerunners of this field, into three areas: technological networks (the Internet, telecommunication networks, power grids, transportation networks), social networks and networks of information (the World Wide Web, citation networks), and biological networks (biochemical, neural and ecological networks)2. These applications have over the last couple of decades defined the boundaries of “network science”, the interconnected web of research on structures in the natural and social world which has gained the standing of an interdisciplinary explanatory tool that reveals underlying common properties of networks3. To what extent can these network science applications be extended from their native domains to the analysis of legal systems?
As a matter of fact technological and biological networks do not actually offer a ready-to-use model to apply the network approach to law 1

In the literature on the evolution of norms there is yet no explicit use of network concepts, except for an approach using complexity theory by J. B. Ruhl, The Fitness of Law: Using Complexity Theory to Describe the Evolution of Law and Society and Its Practical Meaning for Democracy, 49 VAND. L. REV., 1407–90 (1996). Different approaches have been however adopted to explain legal change. A seminal work is by Robert Axelrod, An Evolutionary Approach to Norms, 80 AM. POL. SCI. REV., 1095–1111 (1986). Comprehensive reviews of the concept of evolution in American legal thinking can also be found in Robert C. Clark, The Interdisciplinary Study of Legal Evolution, 90 YALE L. J., 1238 (1981); E. D. Elliott, The Evolutionary Tradition in Jurisprudence, 85 COLUM. L. REV. 38–94 (1985); Herbert Hovenkamp, Evolutionary Models in Jurisprudence, 64 TEX. L. REV., 645–85 (1985), while a debate has unfolded over the last three decades on the evolution of efficient common law – see the volume THE EVOLUTION OF EFFICIENT COMMON LAW (Paul H. Rubin ed. 2007) -. An area of the literature concerning the evolution of norms can be also found in game theory; see, for example: Arnab K. Basu, Notes on Evolution, Rationality and Norms, 152 J. OF INST’L. & THEORETICAL ECON., 739–50 (1996); Jack Hirshleifer, Evolutionary Models in Law and Economics: Cooperation Versus Conflict Strategies, 4 RES. IN L. & ECON., 1– 60 (1982); The evolutionary approach to norms is used also in respect to international law: Francesco Parisi, The Economics of Customary Law: Lessons From Evolutionary Socio-biology, in THE PRODUCTION OF LAW: ESSAYS IN LAW AND ECONOMICS (Bruno Bouckaert & Gerrit De Geest eds., 1997); Bruce L. Benson, Customary Law as a Social Contract: International Commercial Law, 3 CONST. POL. ECON., 1–27 (1992). Finally, from an anthropological perspective, see: Robert Boyd & Peter J. Richerson, The Evolution of Norms: An Anthropological View, 150 J. INST’L & THEORETICAL ECON., 72–87 (1994). 
2 MARK E. J. NEWMAN, NETWORKS. AN INTRODUCTION 15 - 106 (2010). For other reviews of these applications of network theory see: ALAIN BARRAT, MARK BARTHELEMY, ALESSANDRO VESPIGNANI DYNAMICAL PROCESSES ON COMPLEX NETWORKS 180 – 293 (2008); MARK E. J. NEWMAN, ALBERT LASZLO BARABASI, & DUNCAN J. WATTS. THE STRUCTURE AND DYNAMICS OF NETWORKS 415-552 (2006); MELANIE MITCHELL, COMPLEXITY: A GUIDED TOUR 247 – 258 (2009). 3 See in general on network science: Steven H. Strogatz, Exploring Complex Networks 410 NATURE, 268-275(2001); MARK E. J. NEWMAN, ALBERT LASZLO BARABASI, & DUNCAN J. WATTS. THE STRUCTURE AND DYNAMICS OF NETWORKS 1-8 (2006), MELANIE MITCHELL, COMPLEXITY: A GUIDED TOUR 227 – 246 (2009) ; ALBERT-LASZLO BARABÁSI, LINKED: THE NEW SCIENCE OF NETWORKS (2002); MARK BUCHANAN, NEXUS: SMALL WORLDS AND THE GROUNDBREAKING SCIENCE OF NETWORKS (2002); DUNCAN J. WATTS, SIX DEGREES: THE SCIENCE OF A CONNECTED AGE (2003); MARK E. J. NEWMAN, NETWORKS. AN INTRODUCTION (2010).

because they describe material connections found in the real world (for example connections between computers, electrical switches or neurons), while legal systems basically express relations established among individuals and institutions. By contrast, information and social networks are capable of providing significant insights for legal modeling because they deal with intangible connections akin to those signified by legal systems, i.e. information fluxes or items of data linked together in some way. For example the Web is a network in which the nodes are the web pages consisting of information (text, pictures and so on) and the links are the actual hyperlinks that allow the user to navigate from page to page, while social networks represent all kinds of relations between individual actors and highlight specific connective features of human interaction4. The legal system can thus in theory be viewed as a network that consists of the rules (the nodes) and the relations that connect them to one another (the links), and a new body of legal literature is evolving in this direction5. This literature addresses the network structure of common law cases6 or adopts the network approach to shed light on certain legal issues7, but it does not address a central question, i.e. which connections between rules (nodes) should be considered when modeling the legal system as a network. This paper endeavours to fill this gap by adopting an evolutionary network model of the legal system in which the nodes represent enacted rules that are connected by links which represent the production of rules by rules. The model relies on a distinction between primary and secondary rules that can be either general or singular8. Primary rules regulate the conduct of the recipients, while secondary rules are meta-rules, i.e. rules about rules.
In the model general primary rules regulate ex ante classes of individual situations; an example of this type of rules is a statutory rule that imposes a duty on multiple recipients. Singular primary rules regulate individual situations, that is the actual behavior of a specified recipient and are denominated here “Rules of the Case”; an example of this type of rules is an administrative or judicial decision that imposes a duty on a specific individual. As to secondary rules, the model adopts a concept of “rules of production”, i.e. those rules that regulate the production of rules by 4

A seminal work in this field is PETER J. CARRINGTON, JOHN SCOTT AND STANLEY WASSERMAN, MODELS AND METHODS IN SOCIAL NETWORK ANALYSIS (2005). 5 Katherine J. Strandburg, Network Science and Law: A Sales Pitch and an Application to the “Patent Explosion”, 4 30 The Berkeley Electronic Press (bepress); Thomas A.C. Smith, The Web of Law, San Diego Legal Studies Research Paper No. 06-11, at: http://ssrn.com/abstract=642863 (Spring 2005). See also: Richard M. Buxbaum, Is “Network” a Legal Concept?, 149 J. INST’L. & THEORET. ECON., 698–705 (1993). 6 Seth J. Chandler, The Network Structure of Supreme Court Jurisprudence, University of Houston Law Center No. 2005-W-01, at: http://ssrn.com/abstract=742065 (June 10, 2005); Frank B. Cross, Thomas A. Smith, Antonio Tomarchio, Determinants of Cohesion in the Supreme Court's Network of Precedents, Science Research Network Electronic Paper Collection at: http://ssrn.com/abstract=924110; James H. Fowler, Timothy R. Johnson, James F. Spriggs II, Sangick Jeon, Paul J. Wahlbeck, Network Analysis and the Law: Measuring the Legal Importance of Supreme Court Precedents, 15 (3) POLITICAL ANALYSIS 324-346 (2007); James H. Fowler, Sangick Jeon, The Authority of Supreme Court Precedent: A Network Analysis, http://jhfowler.ucdavis.edu/authority_of_supreme_court_precedent.pdf. 7 See for example: Andrea M. Matwyshyn, Of Nodes and Power Laws: A Network Theory Approach to Internet Jurisdiction Through Data Privacy, 98 NW. U.L. REV. 493 (2004); Lior Jacob Strahilevitz, A Social Networks Theory of Privacy, 72 U. CHI. L. REV. 919 (2005); Daniel F. Spulber and Christopher S. Yoo, On the Regulation of Networks as Complex Systems: A Graph Theory Approach, 99 NW. U. L. REV. 493 (2005); Katherine J, Strandburg, Network Science and Law: A Sales Pitch and an Application to the “Patent Explosion”, The Berkeley Electronic Press (bepress), http://law.bepress.com/expresso/eps/1028. 
As to applications to corporate law issues: Claire Moore Dickerson, Corporations As Cities: Targeting the Nodes in Overlapping Networks, 29 IOWA J. CORP. L. 533 (2004); Lawrence Mitchell, Structural Holes, CEOs and Informational Monopolies: The Missing Link in Corporate Governance, 70 BROOKLYN L. REV. 1313 (2005). 8 I have proposed this model in: Carlo Garbarino, An Agent-based Model of Production of Rules by Rules, SSRN (2011), that deals with the specific issues concerning the legitimacy of the model from the perspective of legal theory. The interested reader can find there details on the concepts of rules of production, Rules of the Case, production links and chains of production.

the recipients – defined as “agents” – to whom they are addressed. General rules of production are addressed to multiple agents and regulate ex ante classes of individual acts of production of primary rules by them. These rules therefore underlie the production of rules by different kinds of agents, i.e. (i) legislators, (ii) administrative agencies, (iii) courts in civil law systems, (iv) courts in common law systems and (v) self-compliant individuals. By contrast, singular rules of production are those that regulate the individual acts of production of other rules by those agents; for example a judge individualizes the general rule of production (the rule of the binding precedent, or stare decisis) through a singular rule of production that is addressed to her/himself in order to enact an actual Rule of the Case. Rules of production lead to the establishment by agents of “production links” that connect different rules, thus forming so-called “chains of production”. There are three types of production links that are envisioned here as links of the legal network: 1. production links between a precedent general rule of production and a subsequent singular rule of production, when an agent individualizes the former into the latter; 2. production links between a precedent general primary rule and a subsequent singular primary rule (a Rule of the Case), when an agent derives the latter from the former by applying a statute or regulation to an individual situation; 3. production links between a precedent and a subsequent Rule of the Case, when an agent derives the latter from the former by relying on the binding precedent. Production links in a network representation are directed links because they go only in one direction, i.e. from a precedent rule to a subsequent rule (and not the other way around). In addition, production links do not create cycles or loops, i.e. they lead to rules (nodes) that are always different from the rules (nodes) from which the link originated.
As a result the network of rules connected by such production links is a so called “acyclic directed network”- or more simply a “tree” - i.e. a network in which there are only directed links that do not create loops and that represent irreversible events unfolding over diachronic time9. The concept of chain of production is derived from that of production links, as a chain of production is simply defined here as a sequence of production links connecting different nodes, and thus chains of production represent in network diagram the production of rules by rules over time. In conclusion tree diagrams very neatly represent the evolution of a network of rules over time by looking at the sequence of production links that form identifiable chains of production that ultimately lead to the creation of Rules of the Case regulating individual conducts. The model thus adopts a reductionist approach aimed at prying apart the minimal constituents of the legal system (the Rules of the Case) to understand how they are connected in an evolving network in which rules are produced by rules. The proposition on which this paper is based is twofold: (i) legal systems in a given moment in time have a structure identifiable through a model of them in terms of networks of rules, and (ii) their evolution is a property of this network structure that represents the production of rules by rules represented by production links. The paper argues that legal systems have an “evolutionary structure” which is a network growth property and describes how legal change unfolds within such network-like structure. The reason to apply the network approach to law is thus to shed further light on the evolutionary structure of legal systems and to explain the essential patterns of connection between their essential constituents that lead to normative complexity (this topic will be covered by section 2). 
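The network of production links described above can be illustrated with a short sketch. It is only a minimal illustration, not the model's implementation: the rule labels (`statute`, `case_A` and so on) and the function names are hypothetical, each pair stands for one production link from a producing rule to a produced rule, and acyclicity is checked with a standard topological-ordering test (Kahn's algorithm), which the paper itself does not prescribe.

```python
from collections import defaultdict, deque

# Hypothetical rules: each pair is a production link (producing rule, produced rule).
production_links = [
    ("statute", "case_A"),   # type 2 link: statute applied to an individual situation
    ("statute", "case_B"),
    ("case_A", "case_C"),    # type 3 link: a later case relies on a binding precedent
]

def build_successors(links):
    """Map each producing rule to the list of rules it produces."""
    successors = defaultdict(list)
    for producing, produced in links:
        successors[producing].append(produced)
    return successors

def is_acyclic(links):
    """Return True if the directed links contain no cycle (Kahn's algorithm)."""
    successors = build_successors(links)
    nodes = {node for link in links for node in link}
    indegree = {node: 0 for node in nodes}
    for _, produced in links:
        indegree[produced] += 1
    queue = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    # Every node can be removed only when no cycle exists.
    return removed == len(nodes)

def chains_of_production(links):
    """List every path from a rule with no producer to a rule that produces nothing.

    Assumes the links are acyclic; call is_acyclic first.
    """
    successors = build_successors(links)
    nodes = {node for link in links for node in link}
    produced_nodes = {produced for _, produced in links}
    chains = []

    def walk(path):
        nexts = successors.get(path[-1], [])
        if not nexts:
            chains.append(path)
            return
        for nxt in nexts:
            walk(path + [nxt])

    for root in sorted(nodes - produced_nodes):
        walk([root])
    return chains

print(is_acyclic(production_links))
for chain in chains_of_production(production_links):
    print(" -> ".join(chain))
```

In this toy network the two chains end in `case_C` and `case_B`, the rules that have not (yet) produced further rules.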
Another fundamental intuition of network science that might prove useful in modeling legal change is that the structure of such connections might have a big impact on legal systems as a whole in respect to

9 For these definitions see: MARK E. J. NEWMAN, NETWORKS. AN INTRODUCTION 118 and 127 (2010).

self-organization and clustering of rules (sections 3 and 4), as well as in respect to evolution of rules (section 5). Furthermore this whole exercise may contribute to establishing the underpinnings for empirical work aimed at defining with exactitude the nature and amount of normative complexity of actual segments of legal systems; in that respect a possible application of this approach to the U.S. tax system is presented in section 6.

2. A network definition of normative complexity (statutes and common law)

Rules of the Case are produced by other singular rules of production, and the relationships between all these rules are mapped in the model by production links that result in very numerous chains of production operating within the legal system over time, leading to a potentially unlimited array of Rules of the Case that are viewed here as the minimal resultant elements of the network of rules. It thus appears that the dynamics of creation of rules involves complex and highly distributed underlying generative patterns and results in the production of atomized Rules of the Case, each regulating a specific individual behavior. A comprehensive view of such generative processes is needed and therefore this section attempts to provide an initial network definition of a concept of “normative complexity” in respect both to the application of statutes and the evolution of common law. There are indeed various types of legal complexity, for example the semantic and syntactic complexity of normative language, or the complexity of internal references among rules, but the concept of complexity proposed here is context-specific and relies on network theory by viewing legal systems as a collection of individualized Rules of the Case resulting from multi-tier chains of production. In particular the concept of normative complexity presented here relies on a wider concept of complexity as a degree of hierarchy initially advanced by Herbert Simon. According to that concept systems can be considered as complex when they are composed of different levels of sub-systems (hierarchy), each of them exhibiting strong internal interactions (near-decomposability)10. Likewise legal systems can be considered as complex because they are composed of different levels of sub-systems (hierarchy), each of them exhibiting strong internal interactions (near-decomposability).
As a result normative complexity is viewed here as an essential network property of legal systems viewed as a collection of rules produced by rules. An essential feature of general primary rules (statutory rules and regulations that regulate the conduct of an array of recipients) is that they directly give rise to a (potentially) unlimited set of production links that are created by agents – such as judges, administrative agencies or self-compliant individuals – who enact specific Rules of the Case that apply the general primary rules. In particular widespread self-compliance with rules by a multitude of individual recipients creates a massive branching off of production links in the network. Thus in network terms a general primary rule is a node that can be connected by numerous production links to other nodes, the Rules of the Case created by these atomized agents. More specifically normative complexity in respect to the application of statutes is defined here as the combined proliferation of both singular rules of production and Rules of the Case enacted by agents. There is proliferation of singular rules of production when the agents (particularly the self-compliant individuals) exercise their normative powers, and then proliferation of Rules of the Case follows as the immediate result of proliferation of singular rules of production. Fig. 1 below represents in network notation normative complexity in a situation in which a general statutory rule is enacted. In the example a single statutory rule (Rule 1) eventually leads to a proliferation of Rules of the Case enacted by different types of agents (Rules of the Case 1 through

10 Herbert Simon, The architecture of complexity, PROCEEDINGS OF THE AMERICAN PHILOSOPHICAL SOCIETY, 106 (6), 1962, 467-482.

n) and is represented by a proliferation of production links. This situation occurs when general primary rules (such as statutes and regulations) are applied to individual situations either through decisions by courts or administrative agencies, or through self-compliance by individuals.

Fig 1. Normative complexity in statutory law.

In Fig. 1 above a rule of the current constitution is created (CR). On the basis of the singular rule of production RP1, a general rule of production (RP3, a node) is created. In turn, on the basis of various singular rules of production RP2 (see note 11), various agents (administrative agencies, courts, and individuals) individualize a (potentially unlimited) set of singular rules of production (RP4 through RPn). This is the first feature of normative complexity in statutory law, i.e. the proliferation of singular rules of production brought about by agents, particularly self-compliant individuals. Each of these singular rules of production (RP4 through RPn) then leads to the enactment by an agent of a Rule of the Case and therefore to the establishment, by such an agent, of a production link between a producing rule (Rule 1, a general primary rule) and a produced singular primary rule (Rules of the Case 1 through n). As a result, on the basis of the general rule of production RP3 that is individualized by a set of rules of production (RP4 through RPn), a (potentially unlimited) set of Rules of the Case is created (Rules of the Case 1 through n, nodes) that apply the general primary rule to a (potentially unlimited) set of individual situations. This is the second feature of normative complexity in statutory law, i.e. the proliferation of Rules of the Case. In the chain of production represented at Fig. 1 above, Rule 1 is connected to Rules of the Case 1 through n by production links established by individualized rules of production. More specifically, according to each individualized rule of production (RP4 through RPn), the general primary rule is

11 For the sake of simplicity the singular rule of production RP2 is indicated here as just one node, but in reality there is one singular rule of production enacted by each individual agent who then will enact the Rules of the Case.

the producing rule and Rules of the Case 1 through n are the produced rules. In normative complexity of statutory law the proliferation of singular rules of production leads to a proliferation of Rules of the Case derived by agents from general primary rules. Another important feature of normative complexity in statutory law is evidenced by the diagram at Fig. 1 above. In that diagram the number of different chains of production is equal to the number of Rules of the Case (Rules of the Case 1 through n), as each Rule of the Case is generated by a single chain of production. In Fig. 1 there are n chains of production, each of them resulting in a final Rule of the Case (Rules of the Case 1 through n) regulating an individual situation. Each of these Rules of the Case shares a common underlying set of rules connected by production links (CR and RP3) which is replicated each time an individual Rule of the Case is issued. This implies that production pathways that are routinely replicated constitute a generative common core within the legal system. In this network complexity there are different levels of sub-systems constituted by these chains of production (hierarchy), each of them exhibiting strong internal interactions (near-decomposability), and this appears to be a specific instance of the broad concept of complexity as a degree of hierarchy. As a result normative complexity is considered here as an essential network property of legal systems viewed as a collection of rules produced by rules. Network analysis of cases governed by the rule of the binding precedent (stare decisis) reveals that normative complexity (as defined above at Fig. 1 in the context of legislative production) is also a feature of common law and can be represented by network structures, as in Fig. 2 below.
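Returning to the statutory pattern of Fig. 1, the counting argument above (one chain of production per Rule of the Case, all chains sharing a common core) can be sketched as follows. Since the figure is not reproduced here, the layout is an assumption: each chain is taken to run from CR through RP3 and one singular rule of production to one Rule of the Case, the general primary rule (Rule 1) is left out for brevity, and the labels `singular_RP_i` and `rule_of_the_case_i` are hypothetical.

```python
def statutory_chains(n):
    """Build n chains of production under the assumed Fig. 1 layout:
    CR -> RP3 -> one singular rule of production -> one Rule of the Case."""
    return [
        ["CR", "RP3", f"singular_RP_{i}", f"rule_of_the_case_{i}"]
        for i in range(1, n + 1)
    ]

def common_core(chains):
    """Return the rules that appear in every chain of production."""
    core = set(chains[0])
    for chain in chains[1:]:
        core &= set(chain)
    return core

chains = statutory_chains(5)
print(len(chains))                  # number of chains equals number of Rules of the Case
print(sorted(common_core(chains)))  # the replicated common core
```

With five Rules of the Case the sketch yields five chains whose only shared rules are `CR` and `RP3`, which is the "generative common core" the text refers to.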
An essential feature of common law cases is that they regulate individual situations insofar as they are Rules of the Case, but also give rise to a (potentially) unlimited set of production links that are subsequently created by agents – common law courts – who enact specific Rules of the Case derived from previous Rules of the Case. Thus in network terms a common law case is a node (a Rule of the Case) that can be connected by numerous production links to other nodes (other Rules of the Case) subsequently created by atomized agents (the courts). More specifically normative complexity in respect to the application of common law is defined here as the combined proliferation of both singular rules of production and Rules of the Case created by a specific class of agents, i.e. the common law courts. There is proliferation of singular rules of production when the courts empowered to create these Rules of the Case exercise their normative powers, and then proliferation of Rules of the Case follows as the immediate result of proliferation of singular rules of production created by the courts.

Fig 2. Normative complexity in common law.

In Fig. 2 above a rule of the current constitution is created (CR). On the basis of the singular rule of production RP1, a general rule of production of stare decisis (RP3, a node) is created. Basically, according to the stare decisis rule, for the holding of a precedent case to be applied to a subsequent case the facts must be the same or similar. In turn, on the basis of various singular rules of production RP2 (see note 12), various courts individualize a (potentially unlimited) set of singular rules of production (RP4 through RPn). This is the first feature of normative complexity in common law, i.e. the proliferation of singular rules of production brought about by courts. Each of these individualized rules of production (RP4 through RPn) leads to the enactment by an agent of a Rule of the Case and therefore to the establishment, by such a court acting as an agent, of a production link between a producing singular primary rule (Precedent 1 through n) and a produced singular primary Rule of the Case (Decisions 1 through n). This is the second feature of normative complexity in common law, i.e. the proliferation of Rules of the Case. Thus on the basis of the general rule of production (RP3, stare decisis) that is individualized by a set of rules of production (RP4 through RPn), a (potentially unlimited) set of Rules of the Case is created (Decisions 1 through n, nodes) that apply the precedents to a (potentially unlimited) set of individual situations. In the chain of production of Fig. 2 each precedent is connected to Decisions 1 through n by production links established by individualized rules of production. More specifically, according to each individualized rule of production, each precedent (Precedent 1 through n) is the producing primary rule and Decisions 1 through n are the produced primary rules (Rules of the Case). This simple tree diagram shows how a line of cases can proliferate because of the normative complexity

12 For the sake of simplicity the singular rule of production RP2 is indicated here as just one node, but in reality there is one singular rule of production enacted by each individual agent who then will enact the Rules of the Case.

of the underlying network structure. In normative complexity of common law the proliferation of singular rules of production leads to a proliferation of Rules of the Case (binding precedents) and that, in turn, leads to incremental proliferation of Rules of the Case (subsequent cases) derived by agents from the binding precedents, and so on, with pervasive cascading and path-dependency effects.

3. The measures of connectedness of common law cases (clustering coefficients)

Normative complexity in common law exhibits additional network features related to the connectedness of cases, measured by a so-called “clustering coefficient”. To discuss this issue let us take just three decisions of a court that operates in a common law system under the rule of binding precedent and assume that Decision 1 (binding precedent) is cited and confirmed by both Decisions 2 and 3, and Decision 2 (binding precedent) is cited and confirmed by Decision 3. The network representation of this situation is the following.

Fig. 3. A simple network of Rules of the Case in common law.

In Fig. 3 above it is assumed that there is no statute (general primary rule) to be applied by a court and that a court initially decides a case not relying on precedents (Decision 1), while two other courts rely on such Decision 1 as a binding precedent (Decisions 2 and 3). On the basis of the general rule of production of stare decisis (SD) two individualized rules of production (RP2 and RP3) lead to Decisions 2 and 3. In Fig. 3 the production links connecting the different Rules of the Case form a directed acyclic network which has a certain density, measured by a clustering coefficient, obtained by dividing the number of links between the singular rules by the number of all possible links13. In general the local clustering coefficient in a network with n nodes is the ratio of the number of existing connections between nodes to the number of all possible connections between nodes. The clustering coefficient for a fully connected network is one. For example the clustering coefficient of the cluster made by Decisions 1, 2 and 3 in Fig. 3 is one because the number of directed links between these singular rules (3) is equal to the number of all possible directed links (3).

13 On the clustering coefficient: MARK E. J. NEWMAN, NETWORKS. AN INTRODUCTION 261-265 and 310-315 (2010); STANLEY WASSERMAN – KATHERINE FAUST, SOCIAL NETWORK ANALYSIS 598 – 602 (1994). See also: Duncan Watts & Stephen Strogatz, Collective dynamics of “small-world” networks, 393 NATURE 440 - 442 (1998); Romualdo Pastor-Satorras, Alexei Vazquez & Alessandro Vespignani, Dynamical and correlation properties of the Internet, PHYS. REV. LET., 8725:8701, U188-U190 (2001); MARK J. NEWMAN – ALBERT LASZLO BARABASI – DUNCAN J. WATTS, THE STRUCTURE AND DYNAMICS OF NETWORKS 287 (2006).
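The ratio used for Fig. 3 can be computed directly. The sketch below is a minimal illustration under one stated assumption: in a directed acyclic network of n rules at most one link can exist between any pair of rules, so the number of all possible links is taken to be n(n-1)/2, which gives 3 for the three decisions of Fig. 3. The function name `connectedness_ratio` is hypothetical. Note that this ratio is what the network literature usually calls density; the local clustering coefficient of Watts and Strogatz is defined differently, as the fraction of pairs of a node's neighbours that are themselves linked.

```python
def connectedness_ratio(nodes, links):
    """Existing directed links divided by all possible links.

    Assumes an acyclic network, where each pair of nodes can carry
    at most one link, so the maximum is n * (n - 1) / 2.
    """
    n = len(nodes)
    possible = n * (n - 1) // 2
    if possible == 0:
        return 0.0
    return len(set(links)) / possible

fig3_nodes = ["Decision 1", "Decision 2", "Decision 3"]
fig3_links = [
    ("Decision 1", "Decision 2"),  # Decision 2 confirms Decision 1
    ("Decision 1", "Decision 3"),  # Decision 3 confirms Decision 1
    ("Decision 2", "Decision 3"),  # Decision 3 confirms Decision 2
]

fig3_ratio = connectedness_ratio(fig3_nodes, fig3_links)
# Hypothetical variant: Decision 3 relies on Decision 1 only.
sparser_ratio = connectedness_ratio(fig3_nodes, fig3_links[:2])
print(fig3_ratio, sparser_ratio)
```

The first value is 3/3, the fully connected case discussed in the text; the variant with one production link missing gives 2/3.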

The clustering coefficient among common law cases of Fig. 3 shows the level of connectedness between them and implies an underlying legal semantics in terms of robustness of a common law trend based on precedents. In qualitative terms the high clustering coefficient of Decisions 1, 2 and 3 represents the fact that, in the example, these decisions form a dense and homogeneous cluster of case law in which each previous decision is cited by subsequent decisions as legal precedent. A level of connectedness of rules is also present in normative complexity of statutory law, although to a lesser extent than in common law. In network terms a general primary rule (such as a statutory rule or a regulation) is a node that is connected by numerous production links to other nodes, the Rules of the Case created by atomized agents such as judges, administrative agencies or self-compliant individuals who apply that statutory rule or regulation. The clustering coefficient of this dense web of links connecting one single node (the statutory rule or regulation) to many other nodes (the Rules of the Case created by agents) is obtained by dividing the number of directed links between those rules by the number of all possible links. In that situation there are links between the primary rule and each Rule of the Case that is enacted by agents who apply the primary rule, but there are no links among the Rules of the Case, and therefore the clustering coefficient cannot be equal to one as soon as more than one Rule of the Case is enacted (under the definition given above it actually decreases as further Rules of the Case are added, because the number of possible links grows faster than the number of existing links). The consequence is that in the application of primary general rules by agents there cannot be a fully connected network, as may sometimes occur in common law. Recent literature concerning applications of network theory to common law has developed various measures to represent the cohesion of lineages of cases rendered under the stare decisis rule.
Cross, Smith and Tomarchio for example use regression measures to examine the determinants of cohesion in the Supreme Court’s reliance on precedent in the network of U.S. Supreme Court precedents and highlight that the magnitude of ideological decision-making is consistently associated with a reduction in network cohesion14. In another paper Seth Chandler has specifically analyzed the network structure of Supreme Court jurisprudence in the United States15. This area of research on the authority of Supreme Court precedents is completed by two recent papers by James Fowler et al. The two papers analyze two slightly different data sets containing the majority opinions written by the U.S. Supreme Court and the cases they cite from 1754 to 2002 in the United States Reports and describe a method for creating authority scores using the network data16. A significant shortcoming of this line of research is that it refers to links representing mere citations among cases, which are not evidence of actual production of cases through cases and therefore do not amount to production links in the network sense adopted in this paper. Thus an important clarification must be made: network evolutionary analysis of cases cannot be generically based on citations between cases, but must be based specifically on the retention by subsequent cases of the holding of the previous case or cases, as a production link can be established only if the holding of previous cases is confirmed. Evolutionary analysis of cases must therefore be context-oriented to define exactly which cases are confirmed in the holdings of subsequent cases. Not all citations represent reliance on authority: courts cite precedents just to mention them in passing or because they disagree; often cases are cited

14 Frank B. Cross, Thomas A. Smith, Antonio Tomarchio, Determinants of Cohesion in the Supreme Court's Network of Precedents, Science Research Network Electronic Paper Collection at: http://ssrn.com/abstract=924110.
15 Seth J. Chandler, The Network Structure of Supreme Court Jurisprudence, University of Houston Law Center No. 2005-W-01, at: http://ssrn.com/abstract=742065 (June 10, 2005).
16 James H. Fowler, Timothy R. Johnson, James F. Spriggs II, Sangick Jeon, Paul J. Wahlbeck, Network Analysis and the Law: Measuring the Legal Importance of Supreme Court Precedents, 15 (3) POLITICAL ANALYSIS 324-346 (2007); James H. Fowler & Sangick Jeon, The Authority of Supreme Court Precedent: A Network Analysis, http://jhfowler.ucdavis.edu/authority_of_supreme_court_precedent.pdf.

to build an argument leading to a decision which is not similar or related to such cited precedents. In other situations precedent cases are cited expressly to overrule them or to distinguish the facts, thereby eliminating rather than maintaining production links. In general, precedents are cited within elaborate legal reasoning geared or based on such cases, but without any evident link between those precedents and the decision in the case at hand. In other situations precedents are cited in respect to their obiter dicta rather than to their ratio decidendi, and in some situations very remote and important cases on which a decision is based are not even cited because it is obvious that they constitute a binding precedent17.
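The distinction drawn above between mere citations and production links can be sketched as a filtering step. This is only an illustration under stated assumptions: it presupposes that each citation has already been coded by a human reader with a treatment label, and the labels used here (`followed`, `distinguished`, `overruled`, `mentioned`, `obiter`) as well as the names `Citation` and `extract_production_links` are hypothetical. Precedents that are relied upon without being cited, as noted in the text, would not be captured by such a filter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Citation:
    citing: str      # the later decision
    cited: str       # the earlier decision it cites
    treatment: str   # hypothetical label assigned by a human coder

# Only citations that confirm the holding of the cited case count as production links.
PRODUCTIVE_TREATMENTS = {"followed"}

def extract_production_links(citations):
    """Return (producing rule, produced rule) pairs for confirmed holdings only."""
    return [
        (c.cited, c.citing)
        for c in citations
        if c.treatment in PRODUCTIVE_TREATMENTS
    ]

sample_citations = [
    Citation("Decision 2", "Decision 1", "followed"),
    Citation("Decision 3", "Decision 1", "distinguished"),
    Citation("Decision 3", "Decision 2", "followed"),
    Citation("Decision 4", "Decision 1", "mentioned"),
    Citation("Decision 4", "Decision 3", "obiter"),
    Citation("Decision 5", "Decision 2", "overruled"),
]

links = extract_production_links(sample_citations)
print(len(sample_citations), "citations,", len(links), "production links")
```

In this made-up sample six citations yield only two production links, which shows how much a citation network can overstate the production of cases by cases.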

4. Legal connectors and self-organization of legal systems

Section 2 defined normative complexity as the proliferation of singular rules of production and Rules of the Case when there is a significant amount of legislation applied by agents, and section 3 showed that in common law normative complexity implies the existence of networks of rules with varying clustering coefficients measuring different levels of internal cohesion. Thus legal systems viewed as networks are a complex combination of intertwined webs of rules produced by a vast array of agents. These agents create a massive number of Rules of the Case knitted together by numerous production links, leading to various clusters of highly connected rules. Statutes create a dense web of links connecting general to singular primary rules (judicial application, enforcement and self-compliance). Common law operates as a clustered network in which the nodes are not all uniformly linked to each other, but rather, some are densely connected to each other in clusters, and these clusters are loosely connected to one another. The law viewed in such a way as a network of rules exhibits complex collective behaviour because it consists of a large network of individual components (rules) each typically following relatively simple rules of production, with no central control. The law also involves signalling and information processing because it clearly involves information and semantic contents. Finally the law is adaptive because there are evolutionary processes that occur through the dynamics of chains of production of rules. As a consequence the law can be defined as a complex system in which network components – the rules – with no central control but subject to simple rules of production give rise to complex collective behaviour, sophisticated information processing and adaptation via evolution. Complex legal systems can then be viewed by looking at their network properties both at global and local level.
The network properties at global level are the constraints imposed by production links between rules established by singular rules of production enacted by agents who individualize general rules of production (these concepts have been introduced at section 1). These constraints engender normative complexity both in statute and common law (as defined above at section 2), which activates a selection process in which certain chains of production enact more rules than others, as clearly more production links lead to an increasing number of production links through a kind of evolutionary cascading effect. Network properties at local level are the features of the resulting structure of clusters of Rules of the Case created through production links that have varying levels of clustering coefficients (see section 3 above): higher local clustering coefficients increase the likelihood that consistent rules are incrementally created, ensuring evolutionary predominance at local level of certain normative solutions over others. In conclusion the network properties of the legal system at the global level are all-pervasive and dictate how the system works, while the network properties at the local level define its distributed characteristics. More specifically legal systems at global level exhibit so-called “power law”

17 On implicit or mute law, see Gerald Postema, Implicit Law, 13 LAW AND PHILOSOPHY 361 (1994); Rodolfo Sacco, Mute Law, 43 AM. J. COMP. L. 455 (1995).


network properties, i.e. the existence of nodes with a high number of links, while at the local level there are features of distributed “self-organization” of rules. Let us start with the concept of power-law properties, and then deal with self-organization features. The distinguishing network property of actual legal systems at the global level is the existence of a power-law feature: in those networks of rules there is a high number of Rules of the Case and a relatively small number of main nodes (rules) generating such Rules of the Case, as well as a high density of production links around these main nodes, which thus operate as “connectors”18. It should be recalled that a foundational distinction in network science is between random and scale-free networks: a random network is one in which each node has roughly the average number of links (this is described by a bell-shaped, approximately normal distribution), while a scale-free network is one in which no node can be said to have a typical number of links (this is described by a highly skewed power-law distribution)19. Transposing the network concept of the connector into the legal context, one can say that a connector is a node (a rule) whose number of production links (i.e. the number of Rules of the Case it creates) exceeds the average number of rules produced by the other nodes (the other rules). At the current stage of research exact empirical data on this power-law feature of legal systems are still lacking, and thus it can only be conjectured that there are three main types of connectors: (i) the general rules of production, (ii) statutes and regulations (these are primary rules), and (iii) cases in common law (these are Rules of the Case, i.e. singular primary rules). In the first place, general rules of production are big connectors because they enable different agents to establish production links between these connectors and other singular rules.
For example, the stare decisis rule is connected by directed links to the singular rules of production enacted by individual judges when they decide cases based on precedents; such a rule is therefore represented by a single node positioned in the top echelons of the hierarchical structure of a legal system (a producing rule) that generates a vast number of singular rules of production (produced rules). The jurisdictional rule that attributes to judges and administrative agencies the power to apply statutes and regulations to individual situations is likewise positioned in the top echelons of the hierarchical structure of a legal system and connected by directed links to the singular rules of production enacted by individual judges and agencies when they apply the statutes. Similarly, the rule according to which individuals have the power to apply statutes and regulations to themselves is connected by directed links to the singular rules of production enacted by self-compliant individuals. Each general rule of production of the types described above is represented by a node (a general rule of production that is a producing rule) that yields a vast number of production links leading to other nodes (singular rules of production that are produced rules).
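The notion of a connector can be illustrated with a minimal sketch. The rule names and the toy network below are hypothetical, and averaging only over rules that produce at least one other rule is an assumption made here (in a tree-like network the average over all nodes is always below one, so every producing rule would otherwise qualify).

```python
from statistics import mean

# Hypothetical toy network: each key is a producing rule, each value lists
# the rules it creates through production links.
production_links = {
    "stare_decisis": ["srp_1", "srp_2", "srp_3", "srp_4", "srp_5", "srp_6"],
    "statute_A": ["case_1", "case_2", "case_3", "case_4"],
    "statute_B": ["case_5"],
    "srp_1": ["case_6"],
    "srp_2": [],  # a rule that has produced nothing so far
}


def connectors(links):
    """Return the rules whose number of production links exceeds the average.

    Assumption: the average is taken over producing rules only.
    """
    counts = {rule: len(produced) for rule, produced in links.items() if produced}
    average = mean(counts.values())
    return sorted(rule for rule, n in counts.items() if n > average)


print(connectors(production_links))
```

In this toy network the producing rules have 6, 4, 1 and 1 production links, so the average is 3 and only the first two rules count as connectors.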

18 On connectors or hubs in networks: Reka Albert, Hawoong Jeong, Albert Laszlo Barabasi, Diameter of the World Wide Web, 401 NATURE 130-131 (1999); Jon M. Kleinberg, Authoritative sources in a hyperlinked environment, 46 JOURNAL OF THE ASSOCIATION FOR COMPUTING MACHINERY 604-632 (1999); MARK J. NEWMAN, ALBERT LASZLO BARABASI & DUNCAN J. WATTS, THE STRUCTURE AND DYNAMICS OF NETWORKS 335 (2006).

19 The mathematical theory of networks (graph theory) was initiated by Leonhard Euler (for a historical account see NORMAN L. BIGGS, E. KEITH LLOYD, ROBIN J. WILSON, GRAPH THEORY 1736 – 1936 (1976)), and then developed by Paul Erdos, Alfred Renyi and others. For an account of the latest developments see MARK J. NEWMAN, ALBERT LASZLO BARABASI & DUNCAN J. WATTS, THE STRUCTURE AND DYNAMICS OF NETWORKS 9-155 (2006). Recent contributions are: MARK E. J. NEWMAN, NETWORKS. AN INTRODUCTION 247-260 (2010); Reka Albert & Albert Laszlo Barabasi, Statistical mechanics of complex networks, 74 REV. MOD. PHYS. 47-97 (2002); Albert Laszlo Barabasi & Reka Albert, Emergence of scaling in random networks, 286 SCIENCE 509-512 (1999); MELANIE MITCHELL, COMPLEXITY: A GUIDED TOUR 258 – 272 (2009).


In the second place, statutes and regulations are also connectors when they are massively applied by different sorts of agents (judges, administrative agencies, self-compliant individuals), because in those cases the number of production links originating from a statute (or regulation) is equal to the number of Rules of the Case enacted by such agents when they apply the statute (or regulation) to individual situations. In network terms a statute is thus represented as a node with a multitude of production links leading to numerous other nodes that are Rules of the Case. A change of law through statutes and regulations may therefore trigger chains of production eventually resulting in Rules of the Case that are derived from the newly introduced legislation, either through subsequent decisions of administrative agencies and case law, or through self-compliance by individuals (see Fig. 1 above). Leading cases in common law are the third important class of legal connectors, because numerous Rules of the Case are created on the basis of the binding precedent in a strikingly strong path-dependence pattern. This is highlighted by the notation technique adopted in Fig. 3 above to represent a network of cases in common law. The binding precedent rule of common law implies that each case carries a backward-looking constraint, which connects the case to previous cases in order to reproduce the rulings of those previous cases (previous nodes of the network), as well as a forward-looking effect, which connects that case to future cases yet to be decided (future nodes of the network). A leading case is a Rule of the Case that is connected by production links to the subsequent decisions enacted by individual judges; such a leading case is therefore represented by a node (a Rule of the Case that is a producing rule) that yields a vast number of production links leading to other nodes (Rules of the Case that are produced rules).
In contrast to the legal connectors, most of the nodes (rules) of a legal network have a very limited number of production links (i.e. create a limited number of other singular rules). For example, most statutes have a circumscribed scope and therefore do not create a significant number of Rules of the Case. The same power-law distribution implies that in common law most cases are not invoked as binding precedents and, from a network perspective, are thus terminal nodes that do not give rise to further Rules of the Case. In conclusion, in actual legal systems only a few nodes (rules) have a very high number of production links (i.e. create a high number of Rules of the Case), so it can be conjectured that legal systems tend to be scale-free networks with significant clumps of density of Rules of the Case. The coexistence of a small number of legal connectors and a large number of ordinary rules sheds light on another striking feature of legal systems, i.e. their network resilience to the deletion of nodes (rules). If ordinary rules (those with a limited number of production links) are deleted from the large scale-free legal network, the network’s basic properties do not change, as it keeps its heterogeneous degree distribution. The reason is that a deleted node is overwhelmingly likely to be a low-degree node, since these constitute nearly all nodes in the legal network. However, this legal resilience does not hold if one or more of the important legal connectors are eliminated, as in these cases the legal network loses its scale-free properties. Such features help to explain the fact that legal systems do not suddenly collapse in respect of their ordinary features, except when the most important legal connectors are eliminated during constitutional transitions.
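The resilience argument can be made concrete with a small sketch. The network and rule names are hypothetical, and the number of rules still reachable from the top-level rule is used here as a simple stand-in for the "basic properties" of the network; the text itself speaks of the degree distribution, so this is only an illustrative proxy.

```python
# Hypothetical toy network: one connector (statute_A) and one ordinary statute.
legal_network = {
    "constitution": ["statute_A", "statute_B"],
    "statute_A": ["case_1", "case_2", "case_3", "case_4", "case_5"],
    "statute_B": ["case_6"],
}


def remove_rule(links, rule):
    """Return a copy of the network without `rule` and without links pointing to it."""
    return {
        source: [target for target in targets if target != rule]
        for source, targets in links.items()
        if source != rule
    }


def reachable(links, root):
    """Collect every rule reachable from `root` by following production links."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(links.get(node, []))
    return seen


baseline = len(reachable(legal_network, "constitution"))
after_ordinary = len(reachable(remove_rule(legal_network, "case_6"), "constitution"))
after_connector = len(reachable(remove_rule(legal_network, "statute_A"), "constitution"))
print(baseline, after_ordinary, after_connector)
```

Deleting an ordinary terminal rule removes only that rule, whereas deleting the connector also cuts off every Rule of the Case that depended on it.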
While the feature of legal networks at the global level is a power-law distribution of links generated by legal connectors, the distinguishing network property of legal systems at the local level is “self-organization”. Self-organization results from the interdependence of the constituent elements of systems, which thus exhibit a rather complex behaviour that is not determined by centralized inputs or


commands20. With respect to legal systems represented as networks, complex behaviour specifically identifies the outcome of the self-organization of rules (nodes) connected by production links, and emerges both in statute and case law when there is normative complexity (combined proliferation of singular rules of production and Rules of the Case), as in that situation rules can be generated irrespective of the intentionality of individual actors. This is a subtle and decisive point: the intentionality of actors is not denied by the network properties of legal systems, yet it is constrained by those properties. Each individual agent exercises a form of discretion, but the legal effects of the actions taken by these actors in the aggregate amount to properties of the network21. Thus the likelihood that a new node (a Rule of the Case) is created is critically affected by the network properties of the system: for example, in common law the likelihood that a certain decision is enacted is influenced by the number and relevance of binding precedents as well as by the distribution of judicial powers among courts. As Katherine Strandburg correctly notes, an important lesson from network science is “the importance of network structure in determining collective behavior, such as social norms and behavioral regularities, and flows of information, influence and other goods”. This author also observes that “legal analysis often conceptualizes individual legal actors as responding independently to legal rules in the context of global average social forces. Network science demonstrates that collective behavior may be determined not only by the average impact of global social forces but also by the specific network structures by which these social forces are mediated. Social network studies demonstrate, for example, that access to information depends on one’s position in the network”22.
Simple interactions, such as the creation of Rules of the Case by individual agents, can thus generate an emergent self-organizing behaviour of the legal system as such23, as these agents operate within network constraints defined by the chains of production of rules. The result is an emerging normative complexity of the system that implies a kind of self-organization which is, to a certain extent, independent of the deliberate will of individual agents. In legal systems there are three main types of self-organization at the local level: (i) the behavior of courts under the stare decisis rule, (ii) self-compliance by individuals and (iii) enforcement of statutes by agencies. Self-organization occurs in common law when there is a high clustering coefficient among precedents and subsequent decisions, as in these situations subsequent judges are inclined to follow previous decisions. This phenomenon is captured in network terms by production links between cases measured by high clustering coefficients. Self-organization is also evident when there is a high level of spontaneous compliance with statutes by individuals, as in that situation a general primary rule leads to a kind of synchronized behavior of individuals, each of them self-enforcing the general primary rules of the statutes through their own Rules of the Case (each Rule of the Case is a production link); in this case too there are high clustering coefficients that increase when the

20 On self-organization and the resulting complex behavior there is a wide literature, see in general: JOHN H. MILLER & SCOTT E. PAGE, COMPLEX ADAPTIVE SYSTEMS: AN INTRODUCTION TO COMPUTATIONAL MODELS OF SOCIAL LIFE 45 – 53 (2007); PAUL KRUGMAN, THE SELF-ORGANIZING ECONOMY (1996); CLAUDIUS GROS, COMPLEX AND ADAPTIVE DYNAMICAL SYSTEMS 99 – 128 (2008); STUART KAUFFMAN, THE ORIGINS OF ORDER: SELF-ORGANIZATION AND SELECTION IN EVOLUTION (1993); SELF-ORGANIZATION OF COMPLEX STRUCTURES: FROM INDIVIDUAL TO COLLECTIVE DYNAMICS (Frank Schweitzer, Editor, 1997); PER BAK, HOW NATURE WORKS: THE SCIENCE OF SELF-ORGANIZED CRITICALITY (1997).

21 On the constraining properties of complex systems: YANEER BAR-YAM, DYNAMICS OF COMPLEX SYSTEMS (1997); MURRAY GELL-MANN, HOW ADAPTATION BUILDS COMPLEXITY (1996); MELANIE MITCHELL, COMPLEXITY: A GUIDED TOUR 3 – 14 (2009).

22 Katherine J. Strandburg, Network Science and Law: A Sales Pitch and an Application to the “Patent Explosion”, 25, The Berkeley Electronic Press (bepress), http://law.bepress.com/expresso/eps/1028.

23 The creation of complexity from the combination of simple constituents is beautifully exemplified by STEPHEN WOLFRAM, A NEW KIND OF SCIENCE 105-108 (2002).


number of self-compliant individuals increases24. Finally, self-organization may emerge in bureaucratic action that enforces statutes, as in network terms this action is represented as decisions enacted by agencies (Rules of the Case) connected by production links to subsequent decisions (other Rules of the Case). In that network expansion previous decisions by agencies influence subsequent decisions and form coherent policies, leading to a situation of path dependency in which agencies follow their own previous policies25. In this situation too each decision is a Rule of the Case, and there can be high clustering coefficients akin to those of common law. It is interesting to note that self-organization emerges in those sections of the legal network that exhibit power-law features, i.e. where certain nodes (general rules of production, statutes that are complied with directly by individuals or massively enforced by agencies, and leading cases) generate a significant number of production links that lead to singular rules (either singular rules of production or Rules of the Case). In conclusion, legal systems represented in network diagrams have a twofold property: (i) they generate a huge number of singular rules (both singular rules of production and Rules of the Case) exhibiting self-organization properties that are connected with power-law features in the distribution of production links, and (ii) they appear as an interlocked and interdependent web of singular rules of production and Rules of the Case with varying clustering coefficients. Although few data are available as to the number of singular rules of production and Rules of the Case of actual legal systems, when legal systems are viewed “bottom-up” in this network perspective their most prominent features are that they operate as networks and that they exhibit complex behaviour together with self-organization.
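Since clustering coefficients carry much of the argument on self-organization, a minimal sketch of how a local clustering coefficient is computed may be useful. The cases and links below are hypothetical, and production links are treated as undirected purely for simplicity, which is an assumption of this sketch rather than part of the model.

```python
from itertools import combinations

# Hypothetical cluster of cases; each set lists the cases a case is linked to.
case_links = {
    "leading_case": {"case_1", "case_2", "case_3"},
    "case_1": {"leading_case", "case_2"},
    "case_2": {"leading_case", "case_1", "case_3"},
    "case_3": {"leading_case", "case_2"},
}


def local_clustering(neighbors, node):
    """Fraction of pairs of `node`'s neighbours that are themselves linked."""
    linked_to = neighbors[node]
    if len(linked_to) < 2:
        return 0.0
    linked_pairs = sum(
        1 for a, b in combinations(sorted(linked_to), 2) if b in neighbors[a]
    )
    possible_pairs = len(linked_to) * (len(linked_to) - 1) / 2
    return linked_pairs / possible_pairs


for case in sorted(case_links):
    print(case, round(local_clustering(case_links, case), 3))
```

A coefficient close to 1 means that the decisions surrounding a case are also tightly linked to one another, which is the situation described above as favouring consistent subsequent decisions.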

5. Network growth and evolution of legal systems

Section 4 introduced the concept of self-organization, which occurs around the legal connectors in the diachronic plane (i.e. at different moments in time). This section goes on to discuss how legal systems operate in the diachronic plane by showing that they exhibit an evolutionary structure which can be represented as a property of the network. This discussion of the evolution of legal systems is loosely based on the concept of “network evolution”, which is pivotal in network science because networks are intrinsically dynamic and cannot be studied as static structures. Generally speaking, network evolution involves the analysis of how nodes get connected in processes of network growth26, for example when a telecommunication system is modified. In this process new links between nodes are created and others are modified across a massive number of nodes, and one of the main issues of current research is therefore to understand how this occurs. Network scientists have used the concept of “preferential attachment” and shown that it generally leads to a highly heterogeneous, scale-free distribution in which some nodes acquire a very large number of links in a process of self-organization, while others remain relatively unconnected (this is the so-called rich-get-richer effect)27. This preferential

24 On synchronized behavior see STEVEN STROGATZ, SYNC: HOW ORDER EMERGES FROM CHAOS IN THE UNIVERSE, NATURE, AND DAILY LIFE (2003).

25 On path-dependence: Paul A. David, Why Are Institutions the “Carriers of History”? Path Dependence and the Evolution of Conventions, Organizations, and Institutions, 5 STRUCT. CHG. & ECON. DYNAMICS 205–20 (1994); Stan J. Liebowitz & Stephen E. Margolis, Path Dependence, Lock-in and History, 11 J. L. ECON. & ORG. 205–26 (1995); Stan J. Liebowitz & Stephen E. Margolis, The Fable of the Keys, 33 J. L. ECON. 1–25 (1990).

26 See in that respect: Duncan J. Watts & Steven H. Strogatz, Collective dynamics of “small-world” networks, 393 NATURE 440-442 (1998); DUNCAN J. WATTS, SMALL WORLDS (1999); SERGEY N. DOROGOVTSEV & JOSE FERNANDO MENDES, EVOLUTION OF NETWORKS: FROM BIOLOGICAL NETS TO THE INTERNET AND WWW (2003); M NEWMAN, NETWORKS. AN INTRODUCTION 552-565 (2010).

27 A comprehensive analysis of “preferential attachment” is found in M NEWMAN, NETWORKS. AN INTRODUCTION 487-533 (2010). See also: Sergey N. Dorogovtsev, Jose Fernando Mendes, A. N. Samukhin, Structure of growing networks with preferential linking, 85 PHYS. REV. LETT. 4633-4636 (2000); Albert Laszlo Barabasi - Reka Albert, Emergence


attachment concept has also been supplemented by a “fitness” concept, under which certain nodes have a higher likelihood of acquiring new links through an evolutionary selection process that gives certain nodes a relatively higher fitness than others28. These approaches developed by network science provide statistical models that generally predict the likely distribution of links between nodes, but do not explicitly address the issue of bounded or constrained intentionality, an issue that is typical of legal systems, where evolution occurs through the deliberate addition of new rules or elimination of existing ones by agents, actions that are constrained by rules of production. These network science approaches nevertheless indicate how one should map the actual network growth of the legal system, insofar as they suggest that one should look at the modifications of the nodes of the network (the rules of the legal system). The evolution of legal systems is therefore defined here in network terms by considering the change over time of the nodes (the rules) of the directed acyclic network that is the legal system represented as a tree diagram, rather than by looking at the enactment or abrogation of general norms. There are two types of evolution: (i) regulative evolution and (ii) network evolution. The former involves the actual regulation of individual conduct, and is thus related to the modification of Rules of the Case that individually affect the behaviour of recipients. By contrast, the latter involves the articulation of the network of those rules that eventually lead to the regulation of individual behaviour by agents (courts, administrative agencies and self-compliant individuals) through Rules of the Case, and is thus related to the modification of rules that do not individually affect the behavior of recipients.
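The preferential attachment mechanism recalled at the beginning of this section can be sketched in a few lines. This is a generic growth model from network science, not a model of any actual legal system: the function name, the starting configuration and the choice of one link per new node are all assumptions of this illustration, and it deliberately ignores the fitness concept and the constrained intentionality of agents discussed above.

```python
import random


def grow_network(n_nodes, seed=0):
    """Grow a network one node at a time (n_nodes should be at least 2).

    Each new node links to one existing node chosen with probability
    proportional to that node's current number of links.
    """
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    degree = {0: 1, 1: 1}      # start from two nodes joined by one link
    endpoints = [0, 1]         # every node appears once per link it holds
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)  # degree-proportional choice
        degree[target] += 1
        degree[new_node] = 1
        endpoints.extend([target, new_node])
    return degree


degrees = grow_network(200)
ranked = sorted(degrees.values(), reverse=True)
print("five highest degrees:", ranked[:5])
print("nodes with a single link:", sum(1 for d in ranked if d == 1))
```

In the legal reading proposed in this paper, a new node would be a newly enacted rule and the chosen target the rule that produces it; rules that already have many production links are then more likely to acquire further ones.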
The rules involved in network evolution are: (i) general primary rules, which regulate ex ante classes of individual situations and can only be individualized in Rules of the Case by agents; (ii) general rules of production, which regulate ex ante classes of individual acts of production of primary rules by agents and thus do not immediately extend to regulating individual conduct; and (iii) singular rules of production, which regulate individual acts of production of other rules by agents but do not extend to regulating individual conduct. As a consequence there is regulative evolution when Rules of the Case (terminal nodes) are added or eliminated, whereas there is network evolution when any of the other kinds of rules encompassed by the model presented here (intermediate nodes) are added or eliminated. As all kinds of rules can be either added or eliminated, one can envisage four types of evolution of legal systems (Table 1 below):
1. regulative additive evolution: new Rules of the Case (terminal nodes) are added;
2. regulative diminutive evolution: existing Rules of the Case (terminal nodes) are eliminated;
3. network additive evolution: new general primary rules and new singular/general rules of production (intermediate nodes) are added;

27 (continued) of scaling in random networks, 286 SCIENCE 509-512 (1999). For evidence of preferential attachment in various kinds of networks: Reka Albert, Hawoong Jeong, Albert Laszlo Barabasi, Measuring preferential attachment in evolving networks, 61 (4) EUROPHYSICS LETTERS 567-572 (2003); Mark E. J. Newman, Clustering and preferential attachment in growing networks, 64 PHYSICAL REVIEW E 025102 (2001); Ginestra Bianconi & Albert Laszlo Barabási, Competition and Multiscaling in Evolving Networks, 54 EUROPHYSICS LETTERS 436-444 (2001).

28 Ginestra Bianconi & Albert Laszlo Barabási, Bose-Einstein condensation in complex networks, 86 PHYS. REV. LETT. 5632-5635 (2001). The concept of preferential attachment is also connected with “network effects”, which explain how certain features are retained in a form of path-dependent development: Stan J. Liebowitz, Stephen E. Margolis, Network Externalities (Effects), in THE NEW PALGRAVE'S DICTIONARY OF ECONOMICS AND THE LAW (1998); Sushil Bikhchandani, David Hirshleifer & Ivo Welch, A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades, 100 J. POL. ECON. 992–1026 (1992); Stan J. Liebowitz & Stephen E. Margolis, Market Processes and the Selection of Standards, 9 HARV. J. L. & TECH. 283–318 (1996); Stan J. Liebowitz & Stephen E. Margolis, Are Network Externalities a New Source of Market Failure?, 17 RES. LAW & ECON. 1–22 (1995); Stan J. Liebowitz & Stephen E. Margolis, Network Externality: An Uncommon Tragedy, 8 J. ECON. PERSP. 133–50 (1994).


4. network diminutive evolution: existing general primary rules and existing general rules of production (intermediate nodes) are eliminated29.

EVOLUTION  | NETWORK PROPERTY: REGULATIVE (modification of terminal nodes) | NETWORK PROPERTY: NETWORK (modification of intermediate nodes)
ADDITIVE   | Type 1: New Rules of the Case are added | Type 3: New general primary rules and new singular/general rules of production are added
DIMINUTIVE | Type 2: Existing Rules of the Case are eliminated | Type 4: Existing general primary rules and existing general rules of production are eliminated

Table 1 – Types of evolution of legal systems

In regulative (additive or diminutive) evolution the number of regulated individual situations expands or decreases depending on the net balance of Rules of the Case added and eliminated. While regulative additive evolution encompasses all endemic forms of creation of Rules of the Case, regulative diminutive evolution occurs in a more limited fashion, when administrative decisions are revoked or annulled or judicial decisions are appealed and overturned, a phenomenon that is statistically negligible compared to additive evolution. As a result Rules of the Case tend to increase in number, thereby triggering normative complexity (proliferation of singular rules of production and of Rules of the Case). Network additive evolution is also an important dimension that should not be overlooked. In a network approach normative complexity grows as a result of the creation of new general rules of production that potentially create singular rules of production, which in turn are instrumental in the enactment of new Rules of the Case. Normative complexity also escalates as a result of network additive evolution that occurs through the creation of new general primary rules, which identify new classes of individual situations that can potentially be regulated individually by Rules of the Case enacted by agents. As a result legislation (general primary rules) tends to increase, eventually engendering normative complexity (proliferation of singular rules of production and of Rules of the Case) through the action of atomized agents.
By contrast, when network diminutive evolution occurs, normative complexity is lowered, because the elimination of general rules of production tends to reduce the likelihood that new Rules of the Case will be enacted by agents, and the elimination of general primary rules reduces the number of situations that can potentially be regulated individually by Rules of the Case enacted by agents. In principle network additive evolution tends to exceed network diminutive evolution, and therefore the trend is toward an increase of normative complexity. All in all, additive (regulative and network) evolution tends to exceed diminutive (regulative and network) evolution, with the result that the increase of normative complexity appears to be an

29 Singular rules of production cannot be eliminated: they are self-addressed to the agents that enact them (nomopoiesis), and therefore there is no way to explicitly repeal those rules once they are enacted.


entropic property of the legal network, which evolves through a kind of “network growth”, i.e. a network property that involves the expansion of the nodes and links of the legal network. An additional network property of the evolution of legal systems, specifically based on their scale-free nature, is that modifications of big connectors (nodes) have a greater impact than modifications of other nodes. As the connectors yield a very high number of production links leading to singular rules of production and Rules of the Case, while most rules yield a limited number of production links leading to singular rules, a modification of a connector (for example the repeal of a statute which is massively enforced or complied with, or the overruling of a leading case) has a direct impact on the creation of Rules of the Case and is more relevant in network terms than the change of a node which is not a connector30.
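The four types of evolution distinguished in this section (Table 1) can be expressed as a small classification sketch. The identifiers used for kinds of rules and kinds of change are hypothetical labels chosen for this illustration; the only constraint taken from the text is that singular rules of production cannot be eliminated.

```python
from enum import Enum


class Evolution(Enum):
    REGULATIVE_ADDITIVE = 1    # new Rules of the Case are added
    REGULATIVE_DIMINUTIVE = 2  # existing Rules of the Case are eliminated
    NETWORK_ADDITIVE = 3       # new intermediate rules are added
    NETWORK_DIMINUTIVE = 4     # existing intermediate rules are eliminated


TERMINAL = {"rule_of_the_case"}
INTERMEDIATE = {"general_primary", "general_production", "singular_production"}


def classify(rule_kind, change):
    """Map a change ('add' or 'remove') to a rule of a given kind onto Table 1."""
    if change not in ("add", "remove"):
        raise ValueError(f"unknown change: {change}")
    if rule_kind in TERMINAL:
        if change == "add":
            return Evolution.REGULATIVE_ADDITIVE
        return Evolution.REGULATIVE_DIMINUTIVE
    if rule_kind in INTERMEDIATE:
        if change == "add":
            return Evolution.NETWORK_ADDITIVE
        if rule_kind == "singular_production":
            # Singular rules of production cannot be repealed once enacted.
            raise ValueError("singular rules of production cannot be eliminated")
        return Evolution.NETWORK_DIMINUTIVE
    raise ValueError(f"unknown rule kind: {rule_kind}")


print(classify("rule_of_the_case", "add").name)
print(classify("general_primary", "remove").name)
```

Counting how many observed changes fall into each type over a given period would give a rough measure of the net balance between additive and diminutive evolution discussed above.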

6. A case for the application of network theory to the U.S. tax system

Network science is intrinsically an empirical science, as the scope of a network study is invariably defined by a set of research questions (essentially, what the nodes are and what types of links are considered) that guide the collection of the data used to model the system. Indeed, network science initially advanced by studying readily observable systems, such as the connections between individuals in small social groups31, asking questions about certain apparent connections, and has since extended to much more complex systems where nodes and links are not immediately apparent and are massive in number, by using specific tools to identify them. For example, computer programs (called crawlers) can now surf the Web to find pages (nodes) and links, enabling researchers to create complex network representations based on statistical principles; likewise, social scientists use statistical methods to process detailed data collected through questionnaires in order to pinpoint actors and types of relationships in complex social contexts. Legal systems viewed as continuously evolving networks of rules (nodes) are no doubt very complex objects, and yet the task of actually identifying the rules needed to model these networks and their change can be accomplished because legal data are stored in electronic databases. The consequence is that the material collection of legal data, although it might appear daunting, is similar to that conducted in other fields (for example citation networks or the Web) and is therefore feasible if clear guidelines for such collection are drawn from selective models. Since in the legal field data are available in great abundance, the pivotal issue is then what type of links between rules (nodes) should be considered to guide the collection of such data.
This paper has clarified that production links are the essential relationships between rules that should be included in a model of the legal system, and this last section briefly proposes empirical research that could be conducted by considering the U.S. tax system as a potential case study of a legal network based on the concept of production links. The U.S. tax system is suited to this kind of application of network thinking, as tax rules are contained in a hierarchically ordered corpus composed of the Internal Revenue Code (the “IRC”) and the Regulations (the “Regs”). In the U.S. a complex net of binding rules is in place, with the result that after the IRC, the most important source of tax law is the Regulations promulgated by the

30 Fowler, for example, has shown that the cases of the Supreme Court of the United States that overrule previous cases increase their relevance because they lead to an increased number of subsequent cases of the Supreme Court itself: James H. Fowler, Sangick Jeon, The Authority of Supreme Court Precedent: A Network Analysis, http://jhfowler.ucdavis.edu/authority_of_supreme_court_precedent.pdf.

31 Consider for example the pioneering work on the so-called “small world properties” of networks, according to which a relatively small number of links between nodes is needed to connect any two nodes in the network: Stanley Milgram, The Small World Problem, PSYCHOLOGY TODAY 60-67 (1967); Mark Granovetter, The Strength of Weak Ties, 78 AMERICAN JOURNAL OF SOCIOLOGY 1360-80 (1973). For applications of small world properties in network science: Reka Albert, Hawoong Jeong, Albert Laszlo Barabasi, Diameter of the World Wide Web, 401 NATURE 130-131 (1999); MARK E. J. NEWMAN, NETWORKS. AN INTRODUCTION 552-565 (2010).


Treasury Department. The Code sections are numbered, and the Regulations carry the prefix “1” followed by the number of the Code section under consideration; this clearly indicates a hierarchy between the IRC and the Regs and enables the identification of the Rules of the Case that are the object of investigation32. Most current research on networks has been developed relying on data available on the Internet or in other databases; likewise, it is possible to collect empirical data on the U.S. tax system by using existing databases (such as Westlaw or Nexis) to retrieve the cases relating to specific numbered Sections (§) of the IRC and Regs. A practical concept of “rule” should be used here, namely an actual set of regulatory arrangements concerning a specific tax policy issue, for example a specific Section (or part of a Section) under any Part or Subpart of the IRC. The applications of the network approach to the U.S. tax system envisaged here can be summarized in a few main areas: 1) network measurement of the structure of the U.S. tax system; 2) network measurement of legal connectors within the IRC and Regs; 3) analysis of complex behaviour and self-organization of clusters of tax Rules of the Case; 4) tax design and policy issues. The first kind of practical application of network theory involves the structural analysis of segments of the IRC. This can be relevant, for example, in devising legislative changes, or for the general maintenance of the normative organization of the Code. By network analysis I mean here the analysis of production links between rules. One set of research questions posed by this network analysis relates to the level of proliferation of singular rules of production (see section 2 above on normative complexity) within segments of the IRC.
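The numbering convention just described (a Regulation number made of the prefix “1” followed by the number of the Code section) suggests how Regulations could be attached to their Code sections when collecting data. The sketch below assumes regulation numbers written in the form 1.<section>-<number>; the pattern, the function names and the sample numbers are illustrative assumptions and do not describe the interface of any existing legal database.

```python
import re

# Assumed textual form of a regulation number: "1.<Code section>-<number>".
REG_PATTERN = re.compile(r"^1\.(\d+[A-Z]?)-\d+")


def code_section_for(regulation):
    """Return the Code section encoded in a regulation number, or None."""
    match = REG_PATTERN.match(regulation.strip())
    return match.group(1) if match else None


def group_by_section(regulations):
    """Group regulation numbers under the Code section they refer to."""
    grouped = {}
    for regulation in regulations:
        section = code_section_for(regulation)
        if section is not None:
            grouped.setdefault(section, []).append(regulation)
    return grouped


sample = ["1.162-1", "1.162-5", "1.61-1", "301.7701-3"]
print(group_by_section(sample))
```

Each resulting group links one Code section (a general primary rule) to the Regulations issued under it, which is the first step in reconstructing the chains of production that lead to Rules of the Case.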
This analysis can be carried out by measuring the length of the chains of production formed by rules of production that eventually lead to Rules of the Case, as well as by asking whether all these rules of production are necessary and/or appropriate. A second set of research questions relates to general primary rules, i.e. those rules of the IRC and Regs that lead to the creation of Rules of the Case through production links (decisions rendered by IRS agencies and tax courts, self-compliance rules enacted by individual taxpayers). This measurement is aimed at defining the scope of the combined proliferation of singular rules of production and Rules of the Case, which indicates the level of normative complexity of the IRC.

The second kind of practical application of network theory to the U.S. tax system is the measurement of legal connectors within the IRC and the Regs, which implies the analysis of enforcement of, and self-compliance with, specific rules of the IRC and Regs33. In network terms, the enforcement of a rule of the IRC and Regs is measured by the number of individual applications of that rule by IRS agencies or tax courts that lead to Rules of the Case, either in the form of tax audits or in the form of tax cases when the tax audits are litigated. While tax cases applying specific rules of the

32 Local tax doctrine and case law have clarified in detail how the IRC and Regulations interact, and this also bears on the validity problem. Legislative regulations are given “controlling weight unless they are arbitrary, capricious, or manifestly contrary to the statute” (National Muffler Dealers Ass’n v. U.S.), and “only if the code has a meaning that is clear, unambiguous and in conflict with a regulation does a court have the authority to reject the Commissioner’s reasoned interpretation and invalidate the regulation” (Redlark v. Commissioner). Interpretive regulations are entitled to great weight, but may be held invalid by a court when they are inconsistent with the statutory language or when there is no statutory basis to support the regulatory provision. When a statute may reasonably be construed to have any one of several alternative meanings, the selection of one such meaning in the Regulations will be conclusive even if one or more of the alternative constructions were preferable (Fulman v. U.S.). A court will invalidate a Regulation only if its construction is “unreasonable and inconsistent with the language, history and purpose of the statute” (Estate of Bullard v. Commissioner).

33 On tax enforcement: Louis Kaplow, Optimal Taxation With Costly Enforcement And Evasion, J. PUB. ECON. 221 (1990).


IRC are published and available on databases, individual audits are not. Therefore, to develop a network description of segments of the IRC it would also be necessary to collect data on applications of specific rules of the IRC through Rules of the Case (decisions by IRS agencies). That could be done internally at IRS level, where data on audits are available. In practice, in network terms, the research question would be how many Rules of the Case are produced as individual applications of a specific general primary tax rule of the IRC and Regs; this would be measured by constructing networks of links between these primary rules and the related Rules of the Case. Each application of a general primary rule of the IRC and Regs leading to a Rule of the Case, either in the form of a tax audit or of a tax case when the audit is litigated, is a production link, and when a single general primary rule leads to many production links (i.e. to many Rules of the Case), that general primary rule is a legal connector linked to tax audits or to tax cases when the tax audits are litigated. This kind of analysis would therefore be aimed at finding whether specific rules of the IRC and Regs are legal connectors relevant in network terms. This assessment should have an impact in terms of tax reform, as the repeal of a connector found within the IRC has a greater impact than the repeal of other rules. As the application of general primary rules of the IRC and Regs can be a result of enforcement by the IRS, but also a result of litigation ensuing from tax audits, network analysis should also be based on the production links that a general primary rule of the IRC and Regs can generate in terms of cases decided as a result of litigation. In practical terms the network approach should also address the question of how many audited cases are then litigated within segments of the U.S. tax system.
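The counting exercise described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the rule and case labels are invented placeholders (not real IRC sections or decisions), the tuple layout of a production link is hypothetical, and the threshold that qualifies a rule as a connector is a free parameter that the text does not fix.

```python
from collections import Counter

# Each production link: (primary_rule, rule_of_the_case_id, outcome).
# All labels below are made-up placeholders, not real IRC sections or cases.
LINKS = [
    ("rule_A", "case_1", "audit"),
    ("rule_A", "case_2", "audit"),
    ("rule_A", "case_3", "litigated"),
    ("rule_A", "case_4", "audit"),
    ("rule_B", "case_5", "audit"),
    ("rule_C", "case_6", "litigated"),
]


def link_counts(links):
    """Number of production links (Rules of the Case) per primary rule."""
    return Counter(rule for rule, _case, _outcome in links)


def connectors(links, threshold):
    """Primary rules with at least `threshold` production links, busiest first."""
    counts = link_counts(links)
    hits = [rule for rule, n in counts.items() if n >= threshold]
    return sorted(hits, key=lambda rule: (-counts[rule], rule))


def litigation_rate(links):
    """Share of production links in which the audit went on to litigation."""
    if not links:
        return 0.0
    litigated = sum(1 for _rule, _case, outcome in links if outcome == "litigated")
    return litigated / len(links)
```

With real data the same tallies could be restricted to one segment of the IRC by filtering the links before counting.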
Direct enforcement and litigation are also connected to self-compliance, considering that self-compliance Rules of the Case are those found in duly filed tax returns that are not challenged by the IRS and/or litigated. The tax returns that are not audited or litigated constitute applications of general primary rules of the IRC and Regs and are therefore Rules of the Case in network terms, specifically self-compliance Rules of the Case. Those rules are neither published (like tax cases) nor traceable (like tax audits), but take the form of tax returns (specifically those parts of tax returns which are self-applications by taxpayers of specific rules of the IRC and Regs). Hence it is possible to obtain an approximate measure of the production links of a general primary rule that leads to self-compliance Rules of the Case by adding to the number of tax returns audited by the IRS (decisions) the number of tax returns not audited by the IRS (filed tax returns, the vast majority of cases). When the analysis of legal connectors also encompasses self-compliance rules, a likely result of the network approach to tax enforcement would be that those provisions of the IRC and Regs that are widely self-complied with and that do not trigger audits or litigation operate as connectors within the network, according to power law distributions. In conclusion, the measurement of Rules of the Case resulting from direct enforcement by the IRS, litigation and self-compliance may identify the rules of the IRC and Regs that generate the highest number of production links, thereby operating as legal connectors.
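As a rough illustration of the approximate measure described above, the following Python sketch adds audited and unaudited returns per rule and then applies a standard continuous maximum-likelihood estimator of a power-law exponent to the resulting totals. The tallies are invented placeholders rather than IRS statistics, the dictionary layout is hypothetical, and the estimator is a textbook technique brought in for illustration, not a method stated in the paper; with only a handful of rules the estimate has no statistical weight.

```python
import math

# Hypothetical per-rule tallies of filed returns that apply each rule.
# Keys and numbers are invented placeholders, not IRS statistics.
RETURNS = {
    "rule_A": {"audited": 50, "not_audited": 9950},
    "rule_B": {"audited": 30, "not_audited": 970},
    "rule_C": {"audited": 20, "not_audited": 80},
    "rule_D": {"audited": 5, "not_audited": 5},
}


def total_production_links(tallies):
    """Audited plus unaudited returns per rule (the approximate measure)."""
    return {rule: t["audited"] + t["not_audited"] for rule, t in tallies.items()}


def self_compliance_share(tallies):
    """Fraction of each rule's production links that were never audited."""
    shares = {}
    for rule, t in tallies.items():
        total = t["audited"] + t["not_audited"]
        shares[rule] = t["not_audited"] / total if total else 0.0
    return shares


def power_law_exponent(values, x_min):
    """Continuous MLE: alpha = 1 + n / sum(ln(x / x_min)) over values >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1 + len(tail) / log_sum
```

A heavy-tailed distribution of totals, with a few rules holding most of the links, would be consistent with the connector hypothesis; testing it properly would need many more rules and a goodness-of-fit check.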
Furthermore, by associating costs with each production link (administrative costs and resources for enforcement, litigation costs for litigated cases, compliance costs for self-compliance by taxpayers) it would be possible to gauge segments of the IRC in terms of costs distributed along the network; for example, an IRC rule that is heavily audited would be associated with costs borne by the IRS (production links connecting such a rule with each IRS decision), while an IRC rule that is widely self-enforced would be associated with compliance costs that are shifted to individual taxpayers. The third kind of practical application of network theory to the U.S. tax system is the analysis of complex behavior and self-organization of clusters of tax Rules of the Case. This kind of application of network theory has already proved relevant in the analysis of the evolution of

common law cases, and research has already been conducted in this area by looking at the citations among Supreme Court cases. This research could be further improved by looking at the retention by subsequent tax cases of the holding of a previous case or cases (see Fig. 3 above) in respect of specific issues within the IRC. This would imply, for example, the analysis of production links between precedent and subsequent tax cases in specific areas and could lead to measures of evolutionary fitness of lineages of cases. A direct measure of the evolutionary fitness of a case is the number of production links generated by that case because, as noted above at Section 3 discussing clustering coefficients among common law cases, every time a precedent case is binding on a subsequent case there is a production link between the precedent case and the subsequent case. The mapping of links among tax cases (Rules of the Case) should thus identify which cases are connectors (i.e. which cases have a higher number of links than other cases), as well as measure the clustering coefficient of groups of cases. A similar kind of analysis could also be developed in respect of enforcement policies by looking at the production links connecting previous decisions to subsequent decisions taken by IRS agencies, and this could lead to guidelines on which cases should be litigated, settled or dismissed by the IRS. The fourth kind of practical application of network theory to the U.S. tax system relates to tax design issues, which involve the impact of a change of rules on segments of the IRC represented as a network of rules. Tax design involves varying combinations of different judicial and administrative measures that often imply the enactment of rules of production as well as of Rules of the Case as a viable alternative to the outright formal enactment of legislation. A change of the U.S.
tax system viewed from a network perspective can thus in theory occur through policy arrangements that vary along a continuum from a “top-down pattern” to a “bottom-up pattern” of tax design, as summarized below in Table 2. In a “top-down pattern” a change of the U.S. tax system would be introduced either (i) by amendments to the IRC bound to be subsequently interpreted and implemented by IRS guidelines and/or case law, or (ii) by guidelines issued by the IRS in conjunction with case law without recourse to legislation. In a “bottom-up pattern” a change of the U.S. tax system would be triggered by case law. A summarized view of the possible types of change of the U.S. tax system and their impact in network terms is presented in Table 2 below, where each type of tax change is considered in respect of various factors: the chain of production that underlies the intended changes (first column), the type of additive evolution of the tax system triggered by each type of tax design combination (second column), and the type of policy pattern of the selected tax design combination (third column). Table 2 shows that tax change can be implemented through amendments to the IRC and Regs, through IRS agencies, or can occur spontaneously through case law. Table 2 also shows the possible impact of legal change on the network structure of the U.S. tax system by looking at the various types of rules of the legal systems described at section 2 of the paper34.

34 General primary rules, singular primary rules (Rules of the Case), general rules of production and singular rules of production; see Table 1 above.


Columns: CHAINS OF PRODUCTION | TYPE OF ADDITIVE EVOLUTION | POLICY PATTERN

CHANGE IMPLEMENTED THROUGH IRC AND REGS
  Chains of production: Amendments to IRC and Regs directly regulating behavior, combined with subsequent audits by IRS agencies, decisions by tax courts, and private self-compliance
  Type of additive evolution: Network evolution (new general primary rules); regulative evolution (new Rules of the Case)
  Policy pattern: Top-down (legislative discretionary powers)

CHANGE IMPLEMENTED THROUGH IRS AGENCIES
  Chains of production: Amendments to IRC and Regs that distribute normative powers among agencies; decisions by IRS agencies regulating individual situations
  Type of additive evolution: Network evolution (new general/singular rules of production); regulative evolution (new Rules of the Case)
  Policy pattern: Top-down (administrative discretionary powers)

CHANGE WHICH OCCURS THROUGH CASE LAW
  Chains of production: Decisions by tax courts
  Type of additive evolution: Network evolution (new singular rules of production); regulative evolution (new Rules of the Case)
  Policy pattern: Bottom-up (judicial discretionary powers and binding precedent in common law systems)

TABLE 3 – Network impact of change in the U.S. tax system.

The first kind of modification in network terms of the U.S. tax system is that in which the change would be implemented through amendments to the IRC and Regs. This is a fairly standard “top-down pattern” of tax design in which a tax policy issue is solved by statutes and Regs, which may be subsequently implemented by decisions of IRS agencies regulating individual situations and/or by case law. The amendments to the IRC and Regs lead to new Rules of the Case regulating individual situations, and therefore they trigger both network additive evolution (new general rules of production) and regulative additive evolution (new Rules of the Case). Empirical analysis would therefore also look at the multiple sets of Rules of the Case (decisions by IRS agencies, case law, or self-compliance rules by individuals) within segments of the IRC that are the outcome of the IRC changes. The impact of the legislative change should therefore be assessed in respect of these Rules of the Case generated within the U.S. tax system, with regard to their enforcement costs. For example, an IRC section with a low level of spontaneous self-compliance by recipients may trigger high enforcement through Rules of the Case issued by the IRS (or possibly by courts in ensuing litigation) and may therefore imply significant enforcement costs.

The second kind of possible modification of the U.S. tax system is that in which the change would be implemented through IRS agencies, i.e. decisions taken by those IRS agencies that regulate individual situations ex post (basically tax audits). This is also a widely used “top-down pattern”, which is however more flexible than simple legislative change, as it usually operates through the exercise of administrative discretionary powers that apply existing statutes or regulations (general primary rules), rather than through the enactment of new detailed statutory rules or regulations.

This kind of tax change could also be prompted by legislation that redistributes the normative powers of IRS agencies. The result of the tax change in these cases is therefore a reorganization of segments of the U.S. tax system through a replacement of general rules of production, resulting in a different distribution of lineages of singular rules of production and of Rules of the Case. This kind of tax design approach triggers network additive evolution, as it entails the creation of new general and singular rules of production, as well as regulative additive evolution, as it entails the creation of new Rules of the Case (individual tax audits, and tax cases when the audits are litigated). These kinds of change increase the normative complexity of the U.S. tax system, as new chains of production are added generating (potentially unlimited) multiple sets of subsequent Rules of the Case, and they occur when the adopted policy solution requires procedural arrangements to ensure compliance, for example the setting up of proper offices and the monitoring of the rulings issued over time. The impact of these changes of the U.S. tax system should therefore be assessed in respect of the increased normative complexity created by the proliferation of singular rules of production and Rules of the Case.

The third kind of modification of the U.S. tax system is that in which the change occurs through case law, i.e. decisions by tax courts. This kind of change constitutes a “bottom-up” pattern in which policy-makers (legislators or IRS agencies) decide to leave certain issues open to adjudication; for example, there can be a decision to leave to tax courts the definition of standards or the scope of general clauses.
This kind of tax design approach triggers network additive evolution in so far as it entails the creation of new singular rules of production by agents (the courts) relying on binding precedents, as well as regulative additive evolution as it entails the creation of new Rules of the Case (individual tax cases). This process has an unpredictable outcome, but it may achieve a satisfactory level of stability over time because of the self-organization of rules. In a common law system, in particular, lineages of cases exhibit a kind of self-organization and complex behavior which is caused by the high clustering coefficient among them (see section 4 above).
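The two case-level measures discussed in the paper, the number of production links generated by a case (its evolutionary fitness) and the clustering coefficient among cases, can be illustrated with a minimal Python sketch. The citation pairs are invented placeholders, and treating links as undirected when computing clustering is an assumption made here for simplicity; the paper does not specify which variant of the coefficient it has in mind.

```python
from itertools import combinations

# Hypothetical production links: (precedent, subsequent case bound by it).
# Case labels are invented placeholders, not real decisions.
CITATIONS = [
    ("case_1", "case_2"),
    ("case_1", "case_3"),
    ("case_2", "case_3"),
    ("case_1", "case_4"),
]


def fitness(citations):
    """Production links generated by each case: how often later cases follow it."""
    counts = {}
    for precedent, later in citations:
        counts[precedent] = counts.get(precedent, 0) + 1
        counts.setdefault(later, 0)
    return counts


def neighbours(citations):
    """Undirected adjacency sets (link direction is ignored for clustering)."""
    adj = {}
    for a, b in citations:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def local_clustering(citations):
    """Fraction of each case's neighbour pairs that are themselves linked."""
    adj = neighbours(citations)
    result = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            result[node] = 0.0
            continue
        linked = sum(1 for u, v in combinations(sorted(nbrs), 2) if v in adj[u])
        result[node] = 2 * linked / (k * (k - 1))
    return result
```

In this toy lineage the most-followed precedent has the highest fitness, while the tightly knit later cases show the highest clustering.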


Author Index

Agnoloni, Tommaso
Altamura, Antonio
Ashley, Kevin
Barros-Platiau, Ana Flavia
Bod, Rens
Boer, Alexander
Boulet, Romain
Dari Mattiacci, Giuseppe
De Felice, Deborah
Faro, Sebastiano
Ferrell Bjerke, Elizabeth
Fitzgerald, John
Fiumara, Giacomo
Garbarino, Carlo
Ginsburg, Tom
Giura, Giuseppe
Grabmair, Matthias
Guclu, Hasan
Karstens, Bart
Koolen, Marijn
Lettieri, Nicola
Malandrino, Delfina
Mazzega, Pierre
Pagallo, Ugo
Pareschi, Remo
Potter, Margaret
Rinaldi, Gabriele
Savelka, Jaromir
Sijtsma, Bas
Tamme, Tõnu
Toffoletto, Franco
Täks, Ermo
Verendel, Vilhelm
Vicidomini, Luca
Võhandu, Leo
Winkels, Radboud
Zica, Paolino