The purpose of this 750-1000-word expanded definition is to explore the definition of the term “machine learning” with regard to the scientific community and society. I will analyze how the term is used in three sources: a study on fairness, a news article on education, and a paper on machine translation. My working definition follows that analysis.
In the article “A Snapshot of the Frontiers of Fairness in Machine Learning” by Alexandra Chouldechova and Aaron Roth, the definition of machine learning is straightforward: “Machine learning is no longer just the engine behind ad placements and spam filters; it is now used to filter loan applicants, deploy police officers, and inform bail and parole decisions, among other things” (Chouldechova & Roth, 2020, p. 82). To Chouldechova and Roth, machine learning is a technology that has evolved from automating routine tasks to automating more complex, higher-stakes decisions.
On the other hand, the New York Times article “The Machines Are Learning, and So Are the Students” by Craig S. Smith describes the term differently: “Machine-learning-powered systems not only track students’ progress, spot weaknesses and deliver content according to their needs, but will soon incorporate humanlike interfaces that students will be able to converse with as they would a teacher” (Smith, 2019). Smith treats machine learning as a means to an end, the end being to help students learn better.
As for the article “On the features of translationese” by Vered Volansky, Noam Ordan, and Shuly Wintner, the description of machine learning is a simple one: “In supervised machine-learning, a classifier is trained on labeled examples the classification of which is known a priori. The current task is a binary one, namely there are only two classes: O and T” (Volansky, Ordan, & Wintner, 2015, p. 103). To Volansky et al., machine learning is an assisting tool that can help create more humanlike translation, and it must be supervised, meaning trained on examples whose correct labels are already known, in order to function correctly.
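To make the quoted idea of supervised binary classification concrete, here is a minimal sketch of my own. It is not the method used by Volansky et al.: the nearest-centroid rule, the two numeric features, and every number in the training data are assumptions invented for illustration. I also assume that O and T stand for original and translated text.

```python
from statistics import mean

# Toy labeled examples, invented for illustration only (not data from the paper).
# Each entry is (features, label). The two features are hypothetical, for
# example (average sentence length, ratio of distinct words to total words).
TRAINING = [
    ((18.0, 0.62), "O"),
    ((17.0, 0.60), "O"),
    ((19.0, 0.64), "O"),
    ((24.0, 0.50), "T"),
    ((25.0, 0.52), "T"),
    ((23.0, 0.48), "T"),
]


def train(examples):
    """Return one centroid (mean feature vector) per class label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def classify(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=distance)


centroids = train(TRAINING)
print(classify(centroids, (18.5, 0.61)))  # closest to the "O" examples
print(classify(centroids, (24.5, 0.49)))  # closest to the "T" examples
```

The labels are known before training, which is what “supervised” means in the quote, and the classifier only ever chooses between the two classes.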
The context of all three articles is quite simple. Since none of the articles defines machine learning outright, the quotes I have used above are the passages most relevant to the term, and I will build my discussion of context on them.
The first source is a scholarly article that investigates how machine learning can be made “fair”, or better put, “objective”: “With a few exceptions, the vast majority of work to date on fairness in machine learning has focused on the task of batch classification” (Chouldechova & Roth, 2020, p. 84). For better or for worse, the quote tells us that fairness has typically been studied through batch classification. Batch classification, in this context, means sorting a set of data according to user-defined characteristics and then judging the results against a user-defined standard of fairness. Machine learning automates this sorting and even “learns” how to do it with other types of data. But the idea that such a method guarantees fairness is hard to take seriously, since humans are the ones defining fairness. Because humans have inherent bias, fairness is difficult to judge.
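As a rough illustration of what a user-defined standard of fairness can look like, the sketch below compares approval rates between two groups in one batch of decisions. This is only one possible check, chosen by me and not taken from Chouldechova and Roth. The group names and the decisions are hypothetical.

```python
# Hypothetical batch of decisions, invented for illustration: (group, approved).
DECISIONS = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]


def approval_rates(decisions):
    """Return the fraction of approved cases for each group."""
    totals, approved = {}, {}
    for group, ok in decisions:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + (1 if ok else 0)
    return {group: approved[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Difference between the highest and lowest group approval rates."""
    return max(rates.values()) - min(rates.values())


rates = approval_rates(DECISIONS)
print(rates)              # {'A': 0.75, 'B': 0.25}
print(parity_gap(rates))  # 0.5
```

A person still has to decide which groups to compare and how large a gap counts as unfair, which is exactly where human bias can enter.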
The second source is a news article about technology in education, specifically machine learning and how beneficial it is for teachers: “The system also gathers data over time that allows teachers to see where a class is having trouble or compare one class’s performance with another” (Smith, 2019). For teachers, this system is a way to track a student’s progress or performance without having to analyze the data by hand.
For the last article, the context is machine translation. The term should bring to mind well-known web-based translation services such as Google Translate, NiuTrans, Sogou, and DeepL.
Personally, I am majoring in Computer Systems on the IT Operations track. However, I have a hobby in translation with the assistance of machine translation. So, my working definition for machine learning is “the practice of gathering vast amounts of data, categorizing the data, sorting it, and analyzing it to find out the psyche of people.” For example, given a group of 100 people, the data collected must first be categorized by gender or by whatever category is set. Then, the answers gathered are sorted as correct or incorrect based on the generally accepted answer. Finally, the data is analyzed to produce, for each category, the percentage of the time each question was answered correctly. With that, the machine has a sample of what to expect if someone of category x answers the same set of questions. Done on a macro scale, the machine will be able to predict what a population’s answer could be.
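The steps in my working definition can be sketched in a few lines of code. This is a simplified illustration under my own assumptions: the category names, the single question, and the responses are all made up, and the prediction rule (predict “correct” when a category was right more than half the time) is just one simple choice.

```python
from collections import defaultdict

# Hypothetical responses, invented for illustration:
# (category, question, answered_correctly)
RESPONSES = [
    ("group_x", "q1", True),
    ("group_x", "q1", True),
    ("group_x", "q1", False),
    ("group_x", "q1", True),
    ("group_y", "q1", False),
    ("group_y", "q1", False),
    ("group_y", "q1", True),
    ("group_y", "q1", False),
]


def percent_correct(responses):
    """Return the percentage of correct answers per (category, question)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [correct, total]
    for category, question, correct in responses:
        tally = counts[(category, question)]
        tally[0] += int(correct)
        tally[1] += 1
    return {key: 100.0 * c / t for key, (c, t) in counts.items()}


def predict_correct(table, category, question):
    """Predict a correct answer if the category was right over half the time."""
    return table[(category, question)] > 50.0


table = percent_correct(RESPONSES)
print(table[("group_x", "q1")])                 # 75.0
print(predict_correct(table, "group_y", "q1"))  # False
```

With only eight responses the percentages mean very little; the point is the order of the steps: categorize, sort by correctness, compute percentages, then predict.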
Chouldechova, A., & Roth, A. (2020). A Snapshot of the Frontiers of Fairness in Machine Learning: A group of industry, academic, and government experts convene in Philadelphia to explore the roots of algorithmic bias. Communications of the ACM, 63(5), 82–89. https://doi.org/10.1145/3376898
Smith, C. S. (2019, December 18). The Machines Are Learning, and So Are the Students. The New York Times. https://www.nytimes.com/2019/12/18/education/artificial-intelligence-tutors-teachers.html
Volansky, V., Ordan, N., & Wintner, S. (2015). On the features of translationese. Digital Scholarship in the Humanities, 30(1), 98–118. https://doi.org/10.1093/llc/fqt031