Researchers have created an AI capable of learning from scientific literature through unsupervised learning.
Here, the researchers created a system that could identify and extract information independently, using sophisticated techniques based on statistical and geometrical properties of data to identify chemical names, concepts and structures.
For example, the machine learning program can classify words in the data into categories such as "elements", "energetics" and "binders". In this case, "heat" can be classified as part of "energetics", and "gas" as part of "elements".
This ability helps the computer connect certain compounds with one another to build an understanding of their properties, and to gain insight into how the words are related, with no human intervention required.
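To make the idea of grouping words by their statistical context more concrete, here is a minimal sketch in Python. It is not the researchers' method: it swaps in plain co-occurrence counts and cosine similarity as a simple stand-in, and the tiny corpus, the LABELS list and the function names are all invented for illustration.

```python
from collections import Counter
from math import sqrt

# Toy corpus, invented for illustration (not taken from the study).
SENTENCES = [
    "heat release raises energy during combustion",
    "energetics studies energy release during combustion",
    "gas phase oxygen and nitrogen atoms",
    "elements such as oxygen and nitrogen atoms",
    "binders hold polymer particles together",
    "polymer resin particles hold together",
]
LABELS = ["elements", "energetics", "binders"]


def context_vectors(sentences):
    """Map each word to counts of the other words sharing its sentences."""
    vectors = {}
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            context = vectors.setdefault(word, Counter())
            context.update(w for j, w in enumerate(words) if j != i)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(word, labels, vectors):
    """Pick the label whose context vector is closest to the word's."""
    return max(labels, key=lambda label: cosine(vectors[word], vectors[label]))


VECTORS = context_vectors(SENTENCES)
for word in ("heat", "gas", "resin"):
    print(word, "->", classify(word, LABELS, VECTORS))
```

No one tells the program that "heat" belongs with "energetics". The grouping falls out of the fact that the two words appear alongside the same neighbours, which is the sense in which the learning is unsupervised.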
The researchers ran the test on about 1.5 million abstracts of scientific papers on materials science.
In this test, the researchers had the AI identify a substance known as CsAgGa2Se4 as a thermoelectric material, based on how closely the compound was associated with the word thermoelectric.
The research that led to the discovery of CsAgGa2Se4 can be traced back to 2009. It took scientists around three years to discover the compound, in 2012.
The AI learned from those research papers and concluded that CsAgGa2Se4 is a thermoelectric material in a fraction of that time.
In other words, if the AI had been around in 2009, it could have sped up the discovery.
The result showed that AIs can be programmed to recommend materials for functional applications several years before their actual discovery. This has remarkable implications, considering that most existing Natural Language Processing (NLP) methods used by AIs are supervised, thus requiring labor-intensive input from humans.
Humans have long relied on ingenuity that is said to be driven by passion and intuition. This has helped us make discoveries ranging from medicines to fundamental practices.
Computers lack humans' passion and intuition. But since ingenuity can also come from logic, something that computers excel at, computers too can have "ingenuity".
The researchers' paper shows that AI can predict future scientific discoveries simply by extracting meaningful data from research publications.
The idea is to use language processing as the stepping stone.
Language has long been tied to human thinking, and it has shaped human societies, relationships and, ultimately, intelligence. It is therefore not surprising that AI research aims to capture the full potential of human language and put it into computers.
Natural Language Processing (NLP) is a field of AI that relies heavily on machine learning. The technology aims to make computers capable of assessing, extracting and evaluating information from textual data.
In the future, this method could help computers understand the complex relationships between different layers of information far better than humans can.
It could also be developed to let AIs process and understand complex information at a scale that is impossible for humans to handle.
In the case of CsAgGa2Se4, the AI made its prediction by connecting the compound with words such as 'chalcogenide' (a material containing 'chalcogen elements' such as sulfur or selenium), 'optoelectronic' (relating to electronic devices that source, detect and control light) and 'photovoltaic applications'.
Because the AI had learned that many thermoelectric materials share such properties, it was able to make the prediction, which could have helped the discovery.
And because AIs can be programmed to learn from scientific papers unsupervised, the prospects for this approach are bright.
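The indirect link described here, where a compound is tied to "thermoelectric" through shared neighbouring words rather than a direct mention, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the study's actual model: the four "abstracts", the stopword list and the candidate list are invented, and simple co-occurrence counts with cosine similarity stand in for whatever representation the researchers used.

```python
from collections import Counter
from math import sqrt

# Invented toy "abstracts"; the real corpus was far larger.
# Note that CsAgGa2Se4 never appears next to "thermoelectric".
ABSTRACTS = [
    "Bi2Te3 is a chalcogenide thermoelectric with optoelectronic uses",
    "PbTe is a chalcogenide thermoelectric for photovoltaic applications",
    "CsAgGa2Se4 is a chalcogenide for optoelectronic and photovoltaic applications",
    "NaCl is a salt used for food seasoning",
]
STOPWORDS = {"is", "a", "for", "with", "and", "used"}  # hypothetical list
CANDIDATES = ["CsAgGa2Se4", "NaCl"]


def context_vectors(texts):
    """Count, for each word, the non-stopwords sharing a text with it."""
    vectors = {}
    for text in texts:
        words = [w for w in text.split() if w not in STOPWORDS]
        for i, word in enumerate(words):
            context = vectors.setdefault(word, Counter())
            context.update(w for j, w in enumerate(words) if j != i)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(candidates, target, vectors):
    """Sort candidates by how similar their contexts are to the target's."""
    scored = [(c, cosine(vectors[c], vectors[target])) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


VECTORS = context_vectors(ABSTRACTS)
RANKING = rank(CANDIDATES, "thermoelectric", VECTORS)
for name, score in RANKING:
    print(f"{name}: {score:.2f}")
```

In this toy data, CsAgGa2Se4 ranks above NaCl even though no abstract calls it thermoelectric, because it shares 'chalcogenide', 'optoelectronic' and 'photovoltaic' contexts with materials that are described that way.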