Artificial intelligence and data-mining as regulated by the new Copyright Directive
For enthusiasts in this area of copyright, or for those who just want to get out of the current news bubble, I have prepared a few lines to explain the link between artificial intelligence and data-mining as regulated by the new European Directive on copyright in the digital single market. All this comes with an introduction that should present the concept of artificial intelligence in a less sophisticated way, that is, without addressing algorithms or going into the details of certain technologies used by AI systems.
I must admit that all the research I have undertaken lately started as a parallel study of these two concepts. I noticed the connection between them only after I dived into the technologies used to develop artificial intelligence systems and realized that their essence lies not in the way they perform or in the data on which they operate, but in the analysis and extraction of knowledge on which they rely. Analysis and extraction of knowledge as indicators of intelligence. Or, rather, intelligence as the result of a process of analysis, extraction and application of knowledge. With this in mind, it becomes easier to evaluate an artificial intelligence system, and also to interpret the new exceptions introduced by the European copyright reform.
Of course, there are multiple definitions of artificial intelligence (I even selected several versions along the way, mostly because I like to see those different perspectives and to compare them), but the central idea is that, beyond the need to understand how a system works (with its specific algorithms), what matters is the above-mentioned concept, namely intelligence. Here too, there are multiple ways of understanding or (even) recognizing intelligence and how it can manifest itself. Just as human intelligence is perceived in different ways, either as a singular attribute (C. Spearman) or, on the contrary, as something manifested in multiple forms (Howard Gardner’s multiple intelligences or the analytical, creative and practical triad that Robert Sternberg spoke of), artificial intelligence represents, in fact, distinct abilities to recognise problems, to analyse them and to solve them.
Therefore, an artificially intelligent system is not only one that can reproduce all human abilities at once, but also one in which a single ability is developed and represented at a certain level, as explained below. This aspect also describes the current state of developments in the field of AI.
Corea, for example, has an interesting study that explains the concept of artificial intelligence from the perspective of specific areas, namely the problems for which it has been used, together with the approaches and technologies involved. Through this research, which resulted in a map that I recommend as a very interesting AI learning tool, Corea makes a first classification into weak / narrow AI, strong / general AI and ASI (artificial super intelligence). The study stresses that a system possessing all abilities is for the moment only a wish, current artificial intelligence being limited to a “set of technologies that are incapable of doing something outside their scope”.
As you can see on the vertical axis of the map created by Corea, artificial intelligence is divided into specific domains, which identify abilities such as perception, knowledge, reasoning, organization and communication, as well as creativity and movement (even if the latter two do not appear on the map above). It is important to realize that, even if the technologies are complementary, they can be applied to problems in different domains. I do not mean to suggest that certain technologies are specific only to certain areas, but it is easier and more useful to look at artificial intelligence from the perspective of each particular domain. As pointed out above, what is specific to AI systems is intelligence, that is, the ability to analyse, to extract knowledge from this analysis and to apply it to solve different problems, and this ability can be identified in multiple forms of algorithmic functioning. A system capable of communicating in the Turing (bot) style, for example, will not be less intelligent because it is unable to move.
Besides that, the domains identify problems that can be solved in various ways, and there is no system that handles all forms of communication, for example. Beyond the classical verbal exchange, this concept also describes other ways in which information is transmitted from one environment to another, or from one system to another, using different symbols and based on rules other than those of verbal communication (think, for example, of the transfer of information by electromagnetic radiation or by other electronic signals). Nor should the intelligent aspect of communication be ignored: a system that transmits information must be able to listen, record, analyse and evaluate its environment in a certain way, that is, the whole of the information it interacts with, in order to act as a sender as well as a recipient of certain types of communications. The knowledge we were talking about at the beginning consists of the results of this analysis and evaluation of the environment, which help the system perform independently or relatively independently. Applying this knowledge is another indicator of intelligence, one that helps the system evolve.
Because we spoke at the beginning of the multiple ways of defining artificial intelligence, it should be mentioned that intelligence should not be perceived only from the perspective of human capabilities. Many other forms of life manifest intelligence and, in fact, inspire the development of AI systems, such as those in the field of natural computation, which studies and uses biocommunication, for example, as a way of transmitting information between different species of plants and animals.
While writing this I realize that one way to illustrate it would be to give as many examples as I can, to better show how artificial intelligence works in each domain. I’ll try to do this next time in a more detailed article. Right now I want to return to the main subject and explain why artificial intelligence is so important in the context of the new data-mining exceptions.
I hope that I have not lost any legal practitioners up to now, because the above explanations are really important for a precise interpretation of the new exceptions. Specifically, the analysis above explains why the provisions of the new copyright Directive should not be interpreted as regulating only the activity of data mining, but any type of technology that involves the analysis and extraction of knowledge. If the official text is carefully examined, it becomes clear that the legislator itself speaks in general terms, without any limitation regarding the type of data processing. Data-mining is, in other words, just an umbrella term covering any automated method of analysis and extraction of knowledge:
“text and data mining means any automated analytical technique aimed at analysing text and data in digital form in order to generate information which includes but is not limited to patterns, trends and correlations;”
Quite a lot of legal professionals have assessed the new texts of the directive by analysing their scope only from the technical perspective of the stages in which this specific processing method (data-mining) is carried out. However, a study drafted in 2014 by De Wolf & Partners at the request of the European Commission recommended the terminology of “data analysis” instead of “data mining”, precisely to avoid confusion and not to mislead in terms of scope. Another study, in which one can find the reference to recital 8 of the Directive, stated that data mining is, in fact, a term used “to refer to a variety of analytical tools normally based on the use of digital technologies, big data and the Internet”. Unfortunately, these aspects were not detailed, and even in the context of the references to the texts of the directive, the generic nature of data mining was barely tackled, the public being exclusively focused on the technique known as “data mining”. This focus would not be wrong in itself, since the technique can be found in many of the developments in the AI area (to name only two of them, the projects run by Obvious Art and Sony CSL Research Lab). However, limiting the scope of the exception to a single technology is a wrong approach, which does not take into account the multitude of technologies that can be covered to the same extent by what is meant by “automated analytical knowledge extraction technique”.
In fact, if we try a technical exploration, we’ll see that what happens in Europe on a legislative level reflects a sort of custom in the tech industry, where many equate “data mining” with the whole process of knowledge discovery. This approach is not wrong. It is just a perspective from which mining is not only a technique in itself but a process, a set of stages found in some form in many other technologies that require an examination of the data in order to discover certain information.
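To make the stages described above less abstract, here is a minimal sketch in Python of what an “automated analytical technique” might look like at its very simplest. It is my own illustration, not something taken from the Directive or from any of the studies cited: the function name `mine_text` and the three sample sentences are hypothetical, and real data-mining systems are far more elaborate. The sketch counts how often each word appears (a crude trend) and which words appear together in the same text (a crude correlation).

```python
import re
from collections import Counter
from itertools import combinations


def mine_text(documents):
    """Return word frequencies and per-document word-pair co-occurrences.

    Hypothetical helper for illustration only.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        # Stage 1: turn raw text into analysable units (lowercase words).
        words = re.findall(r"[a-z]+", doc.lower())
        # Stage 2: count how often each word occurs across all documents.
        word_counts.update(words)
        # Stage 3: count each unordered pair of distinct words once per
        # document; sorting makes ("data", "mining") a stable key.
        pair_counts.update(combinations(sorted(set(words)), 2))
    return word_counts, pair_counts


# Made-up sample texts, standing in for a real corpus.
docs = [
    "copyright protects creative works",
    "data mining analyses works",
    "copyright exceptions cover data mining",
]

word_counts, pair_counts = mine_text(docs)
print(word_counts.most_common(3))
print(pair_counts[("data", "mining")])
```

Nothing here is specific to one technology: the same three steps (prepare the data, analyse it, extract information from the analysis) appear, in one form or another, in many other techniques, which is exactly why the broad wording of the definition matters.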
The analysis process itself may seem to be another barrier to understanding and approaching artificial intelligence systems, because patterns, correlations or associations are concepts that seem difficult to assimilate. They only seem unapproachable, though, mainly because we try to understand them in a purely technological context and do not take into account that these methods can be found in any type of human analysis.
Consciously or not, we move daily through a conglomerate of data (the information we interact with). We process that information, we visualize it or otherwise assimilate it for interpretation (because each piece of information needs to be understood and resolved), we associate it with information we already know, and we sort it into different categories, frequently choosing to keep the essentials or to delete older data in order to accumulate new data. We make multiple associations every second, we see the patterns involved, and we coordinate our own learning by incorporating and applying the results of this continuous analysis.
The way the human brain processes information helps explain the way a computer processes data. Computers analyse information and learn from this analysis just as humans do. Computers can be intelligent.
- Communication from the Commission to the European Parliament, the European Council, the Council, the European Economic and Social Committee and the Committee of the Regions on Artificial Intelligence for Europe;
- Thomas Margoni and Martin Kretschmer — “The Text and Data Mining exception in the Proposal for a Directive on Copyright in the Digital Single Market: Why it is not what EU copyright law needs”;
- John McCarthy, Marvin L. Minsky, Nathaniel Rochester and Claude E. Shannon — “A PROPOSAL FOR THE DARTMOUTH SUMMER RESEARCH PROJECT ON ARTIFICIAL INTELLIGENCE”;
- Francesco Corea — “AI Knowledge Map: how to classify AI technologies”;
- Chethan Kumar GN — “Artificial Intelligence: Definition, Types, Examples, Technologies”;
- Robert J. Sternberg — “Toward a triarchic theory of human intelligence”; “Beyond IQ: A triarchic theory of human intelligence”;