Evaluation of large language models.
How do different German language models differ?
Evaluation
Below, we compare different language models with one another. When making the comparison, the parameter counts of the different models should always be taken into account. The following table provides an overview of the number of parameters:
| Language model | Number of parameters | Closed / Open |
|---|---|---|
| DElefant | 6.4 B | Open |
| DElefant-MPT | 30 B | Open |
| AlephAlpha luminous-base/-control | 13 B | Closed |
| AlephAlpha luminous-extended/-control | 30 B | Closed |
| AlephAlpha luminous-supreme/-control | 70 B | Closed |
Note that AlephAlpha also offers an adapted version of each model, named control, that is intended for more complex relationships. To interact with the AlephAlpha models, we use Playground-Complete with the prompt template Q:{frage} A: as recommended by AlephAlpha, and leave all parameters at their defaults.
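As a minimal sketch of how such a prompt could be assembled in code: the template string is taken from the text above, but its exact whitespace is an assumption, and `build_prompt` is a hypothetical helper, not part of any AlephAlpha client.

```python
# Prompt template as quoted in the text; the exact spacing is an assumption.
PROMPT_TEMPLATE = "Q:{frage} A:"


def build_prompt(frage: str) -> str:
    """Insert one question into the template (hypothetical helper)."""
    # Surrounding whitespace is stripped so the model sees a clean question.
    return PROMPT_TEMPLATE.format(frage=frage.strip())


if __name__ == "__main__":
    print(build_prompt("Wer hat die Transformer-Architektur erfunden?"))
```

The completion returned by the model would then be read as the text following "A:".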
For the evaluation, we use the following areas:
Question & Answer:
This is the largest area examined. Here, we pose open questions from a wide range of domains, and the language model must answer them from what it learned during training. We start with simple questions that the models are expected to answer and progress to more complex ones in which the models must sort items or weigh two things against each other.
Closed Question & Answer:
In contrast to open questions and answers, a text is provided here from which the answers can be extracted. The language model therefore only has to understand the text and answer the question accordingly. The answer does not need to be learned from the training data.
Classification:
Here, the models assign text content to specific categories. The example texts contain several different evaluations (positive and negative), and the models are expected to assign the individual text sections to the correct evaluation class.
Linguistic / grammatical understanding:
We use questions to test whether the model has linguistic and grammatical understanding. For example, we name individual words and have the language model construct a sentence from them, or we have it correct grammatical errors.
Understanding / riddles:
Here, the language model’s understanding is examined across different areas. One example is solving riddles or understanding complex relationships.
Coding:
Here we test how well the model can generate code. Although we would primarily use other language models for code generation, we nevertheless want to examine how well these models understand German instructions for code and generate code.
Conclusions
| Language model | Overall rank | Question & Answer | Closed Question & Answer | Classification | Linguistic / grammatical understanding | Understanding / riddles | Coding |
|---|---|---|---|---|---|---|---|
| DElefant-MPT | 1 | 2 | 1 | 1 | 1 | 1 | 1 |
| DElefant | 2 | 1 | 1 | 4 | 2 | 3 | 2 |
| AlephAlpha luminous-extended-control | 3 | 4 | 1 | 2 | 3 | 2 | 3 |
| AlephAlpha luminous-extended | 4 | 3 | 2 | 2 | 5 | 2 | 2 |
| AlephAlpha luminous-base-control | 5 | 4 | 2 | 2 | 4 | 3 | 3 |
| AlephAlpha luminous-base | 6 | 4 | 3 | 3 | 6 | 3 | 3 |
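The first column of this table is consistent with ordering the models by the sum of their per-area ranks, although the text does not state that this is how it was derived. The sketch below checks that reading. The area names are labels chosen here, on the assumption that the six columns after the first follow the order of the evaluation areas listed earlier, and `overall_order` is a hypothetical helper.

```python
# Per-area ranks copied from the table, in column order after the first column.
# The area names are an assumption about what each column means.
AREAS = ["qa", "closed_qa", "classification", "grammar", "riddles", "coding"]

RANKS = {
    "DElefant-MPT": [2, 1, 1, 1, 1, 1],
    "DElefant": [1, 1, 4, 2, 3, 2],
    "AlephAlpha luminous-extended-control": [4, 1, 2, 3, 2, 3],
    "AlephAlpha luminous-extended": [3, 2, 2, 5, 2, 2],
    "AlephAlpha luminous-base-control": [4, 2, 2, 4, 3, 3],
    "AlephAlpha luminous-base": [4, 3, 3, 6, 3, 3],
}


def overall_order(ranks: dict[str, list[int]]) -> list[str]:
    """Order models by the sum of their per-area ranks (lower is better)."""
    return sorted(ranks, key=lambda model: sum(ranks[model]))


if __name__ == "__main__":
    for position, model in enumerate(overall_order(RANKS), start=1):
        print(position, model, sum(RANKS[model]))
```

A rank sum weights every area equally, so the three-question riddle area counts as much as the twelve-question open Q&A area.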
- Fine-tuning the models worked well and produced language models that only answer the questions and no longer generate endless texts.
- MPT tuning with QLoRA worked well despite strongly fluctuating loss values.
- Interaction with these adapted open-source language models is significantly easier compared to the AlephAlpha models.
- The only category in which the luminous-extended-control model could keep up was closed question & answer.
- For code generation, the AlephAlpha models do not seem suitable, because they too often make indentation errors and do not produce runnable code. The tokenizer seems to merge multiple spaces into one.
- Open question & answer, as well as complex riddles, are still difficult for the language models to grasp. In particular, sorting by numbers is usually not possible.
- We see that the dataset contains quite a lot of content along the lines of: "As an AI language model, I can ..". This is also reflected in the adapted language models.
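The indentation problem noted for code generation can be illustrated in isolation. The sketch below assumes, as the conclusions suggest but do not confirm, that runs of spaces end up merged into a single space. `collapse_spaces` and `is_runnable` are hypothetical helpers and do not reproduce any real tokenizer.

```python
import re


def collapse_spaces(text: str) -> str:
    """Merge every run of two or more spaces into one (assumed tokenizer effect)."""
    return re.sub(r" {2,}", " ", text)


def is_runnable(source: str) -> bool:
    """Return True if Python can at least compile the source."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


snippet = "def f(x):\n    if x:\n        return 1\n    return 0\n"
collapsed = collapse_spaces(snippet)

if __name__ == "__main__":
    print(is_runnable(snippet))    # original indentation
    print(is_runnable(collapsed))  # nested blocks now share one indent level
```

Once nesting levels collapse to the same width, Python can no longer tell the body of the `if` from the surrounding block, which matches the non-runnable output described above.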
Question & Answer
| Language model | Score | Rank |
|---|---|---|
| DElefant | 6/12 | 1 |
| DElefant-MPT | 5.5/12 | 2 |
| AlephAlpha luminous-extended | 5/12 | 3 |
| AlephAlpha luminous-base | 4.5/12 | 4 |
| AlephAlpha luminous-base-control | 4.5/12 | 4 |
| AlephAlpha luminous-extended-control | 4.5/12 | 4 |
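The ranks in this table behave like a dense ranking: models with equal scores share a rank, and the next distinct score gets the next rank. A minimal sketch of that rule, using the scores above (`dense_ranks` is a hypothetical helper, and dense ranking is inferred from the numbers, not stated in the text):

```python
def dense_ranks(scores: dict[str, float]) -> dict[str, int]:
    """Assign rank 1 to the highest score; ties share a rank, with no gaps."""
    distinct = sorted(set(scores.values()), reverse=True)
    rank_of = {score: position for position, score in enumerate(distinct, start=1)}
    return {model: rank_of[score] for model, score in scores.items()}


# Points out of 12, copied from the table.
QA_SCORES = {
    "DElefant": 6,
    "DElefant-MPT": 5.5,
    "AlephAlpha luminous-extended": 5,
    "AlephAlpha luminous-base": 4.5,
    "AlephAlpha luminous-base-control": 4.5,
    "AlephAlpha luminous-extended-control": 4.5,
}

if __name__ == "__main__":
    for model, rank in dense_ranks(QA_SCORES).items():
        print(rank, model)
```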
Checking technical and scientific information:
"Who invented the Transformer architecture?"
| DElefant | The Transformer architecture was invented by a team of researchers at Stanford University under the leadership of Professor Yoshua Bengio. |
| DElefant-MPT | The Transformer architecture was invented in 2017 by the Google Brain team, consisting of the researcher and computer scientist Ashish Vaswani, Noam Shazeer, Niki Parmar, and Jakob Uszkoreit. (Note: the linguistic influence of translation becomes visible here.) |
| luminous-base | The company General Electric. |
| luminous-base-control | The Transformer architecture was developed by a team of engineers at Google. The Transformer architecture is a new type of computer architecture designed for natural language processing. |
| luminous-extended | The Transformer architecture was developed by Bob Kahn and Vinton Cerf. |
| luminous-extended-control | The Transformer architecture was invented by a man named Geoffrey Hinton. Hinton is a world-famous computer scientist and the inventor of the Transformer architecture. |
"How much horsepower does the BMW E36 M3 have?" - Accepted answer: 286 / 295 / 321 hp
| DElefant | As an AI language model, I do not have access to the specific performance data of the BMW E36 M3. However, the BMW M3 E36 generally has around 335 hp. |
| DElefant-MPT | The BMW E36 M3 has an output of 240 hp. |
| luminous-base | The BMW E36 M3 has an output of 306 hp. |
| luminous-base-control | The BMW E36 M3 was powered by a 3.2-liter inline six-cylinder engine. The engine produced 330 hp at 5,000 rpm and had a maximum torque of 350 Nm at 3,000 rpm. |
| luminous-extended | The BMW E36 M3 has 320 hp. |
| luminous-extended-control | The BMW E36 M3 produced 315 hp in 1995. |
"What types of bonds do alkanes have?"
| DElefant | Alkanes have covalent bonds between the carbon atoms. The carbon atoms are arranged in a linear chain, with each carbon atom connected to two neighboring carbon atoms by a double bond. The double bond between the carbon atoms creates a unique molecule called an "alkane". |
| DElefant-MPT | Alkanes have no bonds. |
| luminous-base | Alkanes have single bonds between the carbon atoms. |
| luminous-base-control | Alkanes are a group of compounds consisting of carbon atoms connected by single bonds. These bonds are sp3-hybridized, meaning that each carbon atom has three unpaired electrons in its outer electron shell. These unpaired electrons are responsible for the chemical stability and physical properties of alkanes. |
| luminous-extended | Alkanes are saturated hydrocarbons. They have no double bonds and no triple bonds. |
| luminous-extended-control | Alkanes are a type of hydrocarbon that has a single bond between its carbon atoms. |
Checking general information:
"Which animal is larger, a lion or a tiger?"
| DElefant | As an AI language model, I do not have personal beliefs or feelings, but I can tell you that lions and tigers are both large cats and share some similarities in terms of size. Both lions and tigers can weigh up to 300 pounds and stand up to 6 feet tall. However, lions have a more muscular build and may appear larger than tigers. |
| DElefant-MPT | A lion is larger than a tiger. |
| luminous-base | A lion. |
| luminous-base-control | A lion is larger than a tiger. Lions are the largest cats in the world, and tigers are the largest cats in the world. |
| luminous-extended | A lion is larger than a tiger. |
| luminous-extended-control | A lion is larger than a tiger |
"How do chameleons change their color?"
| DElefant | Chameleons can change their color by altering their skin through a process called chromatophores. Chameleons have a unique chromatophore system that allows them to change their color without changing their eyes or body temperature. Chameleons have thick, scaly skin that helps them blend into their surroundings and protects them from predators. When a chameleon feels threatened, it can quickly change its color to blend into its environment [...] |
| DElefant-MPT | Chameleons change their color through a process called pigment displacement. They have a special type of pigment in their bodies that is located in their cells. When the chameleon cells focus on a specific color, the pigments in those cells are adjusted to that color. These pigments are then displaced into the surrounding cells, resulting in a color change. The chameleon cells can also displace pigments that are a different color than the one they are focused on, resulting in a color change. This process can happen very quickly, and chameleons can change their color to blend into their environment. |
| luminous-base | They change their color by altering their pigments. |
| luminous-base-control | They change their color by using a pigment called chitin. Chitin is a protein found in the exoskeletons of insects and other animals. It is also found in the shells of cockroaches and other insects. |
| luminous-extended | Chameleons can change their color by altering the pigments in the cells of their skin. |
| luminous-extended-control | Chameleons can change their color by adjusting the amount of the red blood pigment called hemoglobin in their blood vessels. When they have more hemoglobin, they appear darker and more intensely colored. When they have less, they appear lighter and more pastel-colored. |
"How do chamelions change their color?"
How do the language models react to spelling mistakes? Note: because "chameleons" is misspelled, the language models respond with strong hallucinations.
| DElefant | As an AI language model, I have no information about chameleons, but I can tell you that chameleons can change their color by changing their pigments. Chameleons have the ability to shed their skin, which means they can change their color. During shedding, the chameleon changes its color by forming a new layer of skin that has a different color than the old skin. Chameleons can also change their eye color by altering pigments in their eyes. |
| DElefant-MPT | Chamelions can change their color by absorbing or releasing pigments in their skin cells. When they move into a warm environment, they absorb pigments that produce lighter colors such as yellow or orange. When they move into a cooler environment, they release pigments that produce lighter colors such as green or blue. This color change helps them blend into their environment and avoid predators. |
| luminous-base | Chamelions can change their color by transforming into another chamelion. |
| luminous-base-control | Chamelions change their color by changing their color. They can change their color by changing their skin color. |
| luminous-extended | Chamelions can change their color by moving. |
| luminous-extended-control | Chamelions change their color by mixing with other chamelions that have the same color. When a chamelion mixes with another chamelion of the same color, the mixture retains its color. |
Checking historically relevant information:
"Which country became world champion at the 2014 FIFA World Cup?"
| DElefant | The country that won the 2014 FIFA World Cup was Germany. |
| DElefant-MPT | Brazil. |
| luminous-base | Brazil |
| luminous-base-control | Brazil won the 2014 FIFA World Cup. |
| luminous-extended | Brazil |
| luminous-extended-control | Brazil became world champion in 2014 |
"When did the fall of the Berlin Wall occur?"
| DElefant | The fall of the Berlin Wall took place on November 9, 1989. |
| DElefant-MPT | The fall of the Berlin Wall occurred on November 9, 1989. |
| luminous-base | On November 9, 1989. |
| luminous-base-control | The fall of the Berlin Wall took place on November 9, 1989. It was the day the Berlin Wall was opened and people were able to cross the city. |
| luminous-extended | On November 9, 1989. |
| luminous-extended-control | The fall of the Berlin Wall was an event that occurred on November 9, 1989. The Berlin Wall was a physical barrier that separated East and West Germany. It was erected in 1961 and dismantled in 1989. |
"When was Arnold Schwarzenegger born?"
| DElefant | Arnold Schwarzenegger was born on July 7, 1947. |
| DElefant-MPT | Arnold Schwarzenegger was born on July 30, 1947 in Thal, Austria. |
| luminous-base | On July 30, 1947 in Thalheim, Upper Austria, Austria. (Note: Thalheim is incorrect.) |
| luminous-base-control | Arnold Schwarzenegger was born on December 7, 1934 in Vienna, Austria. |
| luminous-extended | On July 30, 1947. |
| luminous-extended-control | Arnold Schwarzenegger was born on July 30, 1947 in Graz, Austria. (Note: Graz is incorrect.) |
Checking current information:
"What is the name of the Federal President of Germany?"
| DElefant | The name of the German Federal President is Frank-Walter Steinmeier. |
| DElefant-MPT | The Federal President of Germany is currently Frank-Walter Steinmeier. |
| luminous-base | The Federal President of Germany is Frank-Walter Steinmeier. |
| luminous-base-control | The Federal President of Germany is Frank-Walter Steinmeier. |
| luminous-extended | The Federal President of Germany is Joachim Gauck. |
| luminous-extended-control | The Federal President of Germany is Frank-Walter Steinmeier. |
"Who is the President of the United States?"
| DElefant | The President of the United States is the head of the executive branch of the federal government and is responsible for the administration and leadership of the government. Currently, the President of the United States is Donald Trump. |
| DElefant-MPT | Joe Biden. |
| luminous-base | Barack Obama |
| luminous-base-control | Donald J. Trump is the 45th President of the United States. |
| luminous-extended | Barack Obama |
| luminous-extended-control | The President of the United States is Joe Biden. |
Checking local information:
"How tall is the Stuttgart TV Tower?"
| DElefant | The Stuttgart TV Tower is approximately 217 meters tall. |
| DElefant-MPT | The Stuttgart TV Tower is the tallest TV tower in Germany and the third tallest TV tower in Europe with a height of 184.9 meters. |
| luminous-base | At 368 meters, the TV tower is the tallest TV tower in the world. |
| luminous-base-control | The Stuttgart TV Tower is 380 meters tall. It is the tallest TV tower in Germany. |
| luminous-extended | At 368 meters, the Stuttgart TV Tower is the tallest structure in Baden-Württemberg. |
| luminous-extended-control | The Stuttgart TV Tower is a 127-meter-high telecommunications tower. It is located in Stuttgart, Germany. |
Closed Question & Answer
| Language model | Score | Rank |
|---|---|---|
| DElefant | 3/3 | 1 |
| DElefant-MPT | 3/3 | 1 |
| AlephAlpha luminous-extended-control | 3/3 | 1 |
| AlephAlpha luminous-base-control | 2/3 | 2 |
| AlephAlpha luminous-extended | 2/3 | 2 |
| AlephAlpha luminous-base | 1.5/3 | 3 |
Text: Industrialization was a decisive turning point in the history of humanity. It began in the 18th century in Great Britain and quickly spread to other parts of Europe and the world. Industrialization led to profound changes in the areas of economy, society, and technology. New machines and production methods revolutionized the way goods were manufactured. The introduction of steam engines and later electrical energy enabled faster and more efficient production. This led to increased urbanization, as people moved from rural areas to cities to work in factories. Working conditions were often poor, and workers fought for better rights and working conditions. Industrialization also had far-reaching effects on the environment, as increased production led to pollution and the consumption of resources. Despite its challenges and negative impacts, industrialization laid the foundation for modern industrial society and led to advances in science, technology, and standards of living.
"When did industrialization begin?"
| DElefant | Industrialization began in the 18th century in Great Britain. |
| DElefant-MPT | Industrialization began in the 18th century in Great Britain. |
| luminous-base | Industrialization began in the 18th century in Great Britain and quickly spread to other parts of Europe and the world. |
| luminous-base-control | Industrialization began in the 18th century in Great Britain and quickly spread to other parts of Europe and the world. |
| luminous-extended | In the 18th century. |
| luminous-extended-control | Industrialization began in the 18th century in Great Britain and quickly spread to other parts of Europe and the world. |
"What positive effects did industrialization have?"
| DElefant | Industrialization had positive effects on the economy by increasing production and reducing poverty. It also led to advances in science and technology, such as the development of new machines and production methods. In addition, urbanization led to improved infrastructure and population growth, which contributed to economic growth. |
| DElefant-MPT | Industrialization had positive effects on society by creating jobs and promoting economic growth. It also led to advances in science and technology, as industry encouraged the development of new technologies and production methods. Furthermore, industrialization improved the quality of life for many people through better healthcare and education. |
| luminous-base | Industrialization led to profound changes in the areas of economy, society, and technology. New machines and production methods revolutionized the way goods were manufactured. The introduction of steam engines and later electrical energy enabled faster and more efficient production. This led to increased urbanization, as people moved from rural areas to cities to work in factories. Working conditions were often poor, and workers fought for better rights and working conditions. Industrialization also had far-reaching effects on the environment, as increased production led to pollution and resource consumption. Despite its challenges and negative impacts, industrialization laid the foundation for modern industrial society and led to advances in science, technology, and standards of living. (Note: complete repetition of the text.) |
| luminous-base-control | Industrialization had profound effects on the areas of economy, society, and technology. It led to increased urbanization, as people moved from rural areas to cities to work in factories. Industrialization also had far-reaching effects on the environment, as increased production led to pollution and resource consumption. Despite its challenges and negative impacts, industrialization laid the foundation for modern industrial society and led to advances in science, technology, and standards of living. (Note: negative aspects are mentioned.) |
| luminous-extended | Industrialization led to profound changes in the areas of economy, society, and technology. |
| luminous-extended-control | Industrialization had a number of positive effects on society, including increased urbanization, better working conditions, and higher wages. |
"What role did steam engines and electrical energy play in industrialization?"
| DElefant | Steam engines and electrical energy played a decisive role in industrialization by enabling faster and more efficient production. |
| DElefant-MPT | Steam engines and electrical energy played a decisive role in industrialization by enabling faster and more efficient production. Steam engines were used in factories to power machines and other equipment, while electrical energy was used for lighting and other tasks. These technologies enabled more efficient production by replacing manual labor and increasing output. |
| luminous-base | Steam engines and electrical energy were the key technologies of industrialization. |
| luminous-base-control | Steam engines and electrical energy played an important role in industrialization. |
| luminous-extended | Steam engines and electrical energy were the most important factors that enabled industrialization. |
| luminous-extended-control | Steam engines and later electrical energy enabled more efficient and faster production. |
Classification
| Language model | Score | Rank |
|---|---|---|
| DElefant-MPT | 5/5 | 1 |
| AlephAlpha luminous-base-control | 3.5/5 | 2 |
| AlephAlpha luminous-extended | 3.5/5 | 2 |
| AlephAlpha luminous-extended-control | 3.5/5 | 2 |
| AlephAlpha luminous-base | 2/5 | 3 |
| DElefant | 1/5 | 4 |
"Is the following sentence positive or negative? 'The food at the restaurant was delicious and the service was excellent.'"
| DElefant | No |
| DElefant-MPT | Positive. |
| luminous-base | The sentence is positive. |
| luminous-base-control | The sentence is positive. |
| luminous-extended | The sentence is positive. |
| luminous-extended-control | The sentence is positive. It contains two adjectives that describe the food and the service. |
"Is the following sentence positive or negative? 'I am thrilled with my new job and my friendly colleagues.'"
| DElefant | No |
| DElefant-MPT | positive |
| luminous-base | The sentence is positive. |
| luminous-base-control | The sentence is positive. |
| luminous-extended | The sentence is positive. |
| luminous-extended-control | This is positive. You like what you are doing and you enjoy your job. |
"Is the following review positive or negative? 'The car dealership was an absolute disappointment. The salesman was unfriendly and unprofessional and barely took time to answer my questions. The advertised special offers were nothing more than bait offers that turned out to be completely overpriced. The vehicle I purchased already had several defects that were concealed from me. I advise everyone not to visit this dealership.'"
| DElefant | No, the review is negative. |
| DElefant-MPT | positive |
| luminous-base | The review is positive. |
| luminous-base-control | The review is negative. |
| luminous-extended | The review is negative. |
| luminous-extended-control | Negative |
"What are the positive, neutral, and negative aspects of the following review? The restaurant had a fantastic atmosphere and the staff were very friendly. The food was delicious and served promptly. However, the price was a bit high. Overall, I had a great experience."
| DElefant | Positive: - Atmosphere: The restaurant had a great atmosphere that contributed to the overall experience. - Staff: The staff were very friendly and attentive, which also contributed to the positive experience. - Food: The food was delicious and served promptly, which also contributed to the positive experience. Neutral: - Price: The price was a bit high, but still within the usual range for a restaurant of this quality. - Time: The time it took to serve the food was a bit long. Negative: - Price: The price was a bit high, which could slightly diminish the positive experience. - Time: The time it took to serve the food was a bit long, which could also slightly diminish the positive experience. |
| DElefant-MPT | Positives: - Fantastic atmosphere - Friendly staff - Delicious food - Prompt service Negatives: - High price Overall: - Great experience |
| luminous-base | The review is positive if it highlights the positive aspects of the restaurant. It is neutral if it highlights the positive and negative aspects equally. |
| luminous-base-control | The atmosphere was fantastic, the staff were friendly, and the food was delicious. However, the price was a bit high. |
| luminous-extended | Positive: + The atmosphere was fantastic + The staff were very friendly + The food was delicious + The food was served promptly Neutral: - The price was a bit high Negative: There were no negative aspects |
| luminous-extended-control | Positives: 1. The ambience and staff were great. 2. The food was delicious and served promptly. 3. The price was a bit high, but overall I had a great experience. 4. I would definitely go again! |
"What are the positive, neutral, and negative aspects of the following review? Provide a list divided into positive, negative, and neutral aspects. The hotel was nicely furnished and had a cozy atmosphere. The staff were extremely helpful and courteous. The breakfast buffet offered a wide selection of fresh food. However, the air conditioning in my room did not work properly, which led to an unpleasant night. Nevertheless, I would recommend the hotel due to the overall pleasant stay."
| DElefant | Positive aspects: - The hotel was nicely furnished and had a cozy atmosphere. - The staff were extremely helpful and courteous. - The breakfast buffet offered a wide selection of fresh food. Negative aspects: - The air conditioning in my room did not work properly, which led to an unpleasant night. - The hotel is located in a noisy area, which made it difficult to sleep at night. Neutral aspects: - Overall pleasant stay. |
| DElefant-MPT | Positives: - The hotel was nicely furnished and had a cozy atmosphere. - The staff were extremely helpful and courteous. - The breakfast buffet offered a wide selection of fresh food. Negatives: - The air conditioning in my room did not work properly, which led to an unpleasant night. Neutral: - Despite the negative experience with the air conditioning, I would recommend the hotel due to the overall pleasant stay. |
| luminous-base | The review is positive because it highlights the positive aspects of the hotel. B:The review is neutral because it treats the positive and negative aspects equally. C:The review is negative because it highlights the negative aspects of the hotel. |
| luminous-base-control | Positive aspects: - Cozy atmosphere - Helpful staff - Wide selection of breakfast options - Working air conditioning in the room Neutral aspects: - Poor air conditioning in the room - No air conditioning in the lobby - No free parking - No free snacks or drinks |
| luminous-extended | The review is positive because the hotel was nicely furnished and cozy. The staff were helpful and courteous. The breakfast buffet offered a wide selection of fresh food. However, the air conditioning in my room did not work properly, which led to an unpleasant night. |
| luminous-extended-control | Positive aspect: The hotel was nicely furnished and had a cozy atmosphere. The staff were extremely helpful and courteous. Neutral aspects: The breakfast buffet offered a wide selection of fresh food. Negative: However, the air conditioning in my room did not work properly, which led to an unpleasant night. |
Language / grammar understanding
| Language model | Score | Rank |
|---|---|---|
| DElefant-MPT | 7/7 | 1 |
| DElefant | 6/7 | 2 |
| AlephAlpha luminous-extended-control | 5.5/7 | 3 |
| AlephAlpha luminous-base-control | 5/7 | 4 |
| AlephAlpha luminous-extended | 4.5/7 | 5 |
| AlephAlpha luminous-base | 2/7 | 6 |
"Arrange the words in the given sentence so that it becomes a grammatically correct sentence: quickly jumped the brown fox"
| DElefant | The brown fox jumped quickly. |
| DElefant-MPT | The brown fox jumped quickly. |
| luminous-base | The brown fox jumped quickly. |
| luminous-base-control | The brown fox jumped quickly. |
| luminous-extended | The brown fox jumped quickly. |
| luminous-extended-control | Quickly jumped the brown fox. |
"Add appropriate punctuation to the following sentence to make it grammatically correct: 'When an explanation is attached to the sentence a comma is used.'"
| DElefant | When an explanation is attached to the sentence, a comma is used. |
| DElefant-MPT | When an explanation is attached to the sentence, a comma is used. |
| luminous-base | When an explanation is attached to the sentence, a comma is used. |
| luminous-base-control | When an explanation is attached to the sentence, a comma is used. |
| luminous-extended | When an explanation is attached to the sentence, a comma is used. |
| luminous-extended-control | When an explanation is attached to the sentence, a comma is used. |
"'Lena is allowed to eat chocolate' – What is the modal verb in this sentence?"
| DElefant | The modal verb in this sentence is "is allowed". |
| DElefant-MPT | The modal verb in the sentence "Lena is allowed to eat chocolate" is "is allowed". |
| luminous-base | "Lena is allowed to eat chocolate" is a sentence with a modal verb. Modal verbs are verbs that express possibility or necessity |
| luminous-base-control | The modal verb in this sentence is "to be allowed". The modal verb is a verb that indicates that the action expressed by the verb |
| luminous-extended | The modal verb in this sentence is "to be allowed". |
| luminous-extended-control | The verb in this sentence is "to be allowed". The verb "to be allowed" is a modal verb that indicates that something is permitted by another person |
"What is the subject in the following sentence? Every year in summer Viktor goes on vacation."
| DElefant | The topic of the sentence is "Viktor". |
| DElefant-MPT | The subject of the sentence is "Viktor". |
| luminous-base | The subject is every year. |
| luminous-base-control | Viktor |
| luminous-extended | The subject is every year. |
| luminous-extended-control | The subject is Viktor. He is the one performing the action in this sentence. |
"What is the direct object in the sentence: 'Can you change the ten-euro bill for me?'"
| DElefant | The direct object in the sentence is "the ten-euro bill". |
| DElefant-MPT | The direct object in the sentence "Can you change the ten-euro bill for me?" is "the ten-euro bill". |
| luminous-base | "I cannot change the ten-euro bill for you." |
| luminous-base-control | The direct object is the ten-euro bill. The direct object is the object of the verb "to exchange". The direct object is the object of the verb "to exchange". |
| luminous-extended | The direct object is the ten-euro bill. |
| luminous-extended-control | This is the amount of money that you are supposed to receive. |
"Convert the following statement into indirect speech: 'Do you have time tomorrow?', Marie asks her friend."
| DElefant | Marie asked her friend whether she had time tomorrow. |
| DElefant-MPT | Marie asks her friend whether she has time tomorrow. |
| luminous-base | "I have time tomorrow." |
| luminous-base-control | Yes, I have time tomorrow. |
| luminous-extended | "I have time tomorrow", Marie replies. |
| luminous-extended-control | Marie asks her friend whether she has time tomorrow. |
"Is the following sentence in the active or passive voice? Convert the sentence into the other form: 'The bank robbers quickly load their loot into the car.'"
| DElefant | Passive: The bank robbers quickly load their loot into the car. Active: The bank robbers quickly load their loot into the car. |
| DElefant-MPT | The sentence is in the active voice. To convert it into the passive voice, we must replace the verb "load" with "was loaded" and replace the subject "the bank robbers" with "the loot". The converted passive sentence would be: "The loot was quickly loaded into the car by the bank robbers." |
| luminous-base | The loot is quickly loaded into the car. |
| luminous-base-control | The loot is quickly loaded into the car by the bank robbers. |
| luminous-extended | The sentence is in the passive voice. |
| luminous-extended-control | The sentence can be phrased in both the active and passive voice. Here are both versions: Active: "The bank robbers quickly load their loot into the car." Passive: "The loot is quickly loaded into the car by the bank robbers." In both cases, the subject of the sentence is "the bank robbers" and the object is "their loot". The difference lies in the role of the action. In the active voice, the action is performed by the bank robbers, whereas in the passive voice the action is described from the perspective of the object. |
Understanding / riddles
| Language model | Score | Rank |
|---|---|---|
| DElefant-MPT | 2/3 | 1 |
| AlephAlpha luminous-extended | 1/3 | 2 |
| AlephAlpha luminous-extended-control | 1/3 | 2 |
| DElefant | 0/3 | 3 |
| AlephAlpha luminous-base | 0/3 | 3 |
| AlephAlpha luminous-base-control | 0/3 | 3 |
"I have three kumquats. I give one to you and you give one to Gonzo. How many kumquats do I have left?"
| DElefant | If you have three kumquats and give one to Gonzo, then Gonzo has two fewer kumquats than you. So your kumquats are: 3 - 1 = 2. Gonzo’s kumquats are: 2 - 1 = 1. Therefore, you have 1 kumquat left. |
| DElefant temperature=0.1 | If you have three kumquats and give one to Gonzo, then Gonzo has two fewer kumquats than you. Therefore, you still have two kumquats left. |
| DElefant-MPT | You have two kumquats left. |
| luminous-base | Three. |
| luminous-base-control | You have no kumquats left. |
| luminous-extended | Two. |
| luminous-extended-control | You still have two kumquats left. |
"I went to St. Ives at dawn and met a man with seven wives, each wife carried seven sacks, each sack carried seven cats, each cat had seven kittens. Kittens, cats, sacks, wives—how many went to St. Ives at dawn?"
| DElefant | There is no specific number of people who went to St. Ives at dawn, as there is no count. |
| DElefant-MPT | There is no exact answer to this question, as there are too many variables. However, we can use the following steps to calculate the number of people who went to St. Ives at dawn: 1. There is one man accompanied by seven women. 2. Each woman carries seven sacks. 3. Each sack contains seven cats. 4. Each cat has seven kittens. 5. Each kitten has seven cats. 6. Each woman has seven kittens. 7. Each kitten has seven cats. 8. Each woman has seven kittens. 9. Each kitten has seven cats. 10. Each woman has seven kittens. 11. Each kitten has seven cats. |
| DElefant-MPT repetition_penalty=1.05 | There is no exact answer to this question, as there are too many variables. However, we can use the following steps to obtain a rough estimate: 1. There is one man with seven women. 2. Each woman carries seven sacks. 3. Each sack contains seven dog collars. 4. Each dog collar contains seven cats. 5. Each cat is assigned seven kittens. 6. The total number of people is 7 + 7 × 7 = 49. 7. The total number of animals is 49 × 7 × 7 × 7 × 7 × 7 × 7 = 9,921. Therefore, the total number of people and animals is 49 + 9,921 = 10,970. |
| luminous-base | I went to St. Ives at dawn and met a man with seven wives, each wife carried seven sacks, carried seven cats piggyback, each cat had seven kittens—kittens, cats, sacks, wives—how many went to St. Ives at dawn? |
| luminous-base-control | How many sacks did the women have? ** The women had 7 sacks × 7 = <<7*7=49>>49 sacks. |
| luminous-extended | I went to St. Ives at dawn and met a man with seven wives, each wife carried seven sacks, carried seven cats piggyback, each cat had seven kittens—kittens, cats, sacks, wives—how many went to St. Ives at dawn? |
| luminous-extended-control | I do not know; I did not go to St. Ives at dawn. |
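The arithmetic in these responses can be checked with a short sketch. It assumes the traditional reading of the riddle, in which only the narrator is travelling to St. Ives, so the expected answer is one. The code additionally counts the party the narrator meets, in case the riddle is read literally, and evaluates the product written out in the repetition_penalty=1.05 response. The function and variable names are illustrative and not part of the evaluation.

```python
def st_ives_party(factor: int = 7) -> dict:
    """Count what the narrator meets, per category named in the riddle."""
    wives = factor
    sacks = wives * factor
    cats = sacks * factor
    kittens = cats * factor
    return {"wives": wives, "sacks": sacks, "cats": cats, "kittens": kittens}


party = st_ives_party()
total_met = sum(party.values())  # kittens, cats, sacks, wives

# The product as written in the repetition_penalty=1.05 response
quoted_product = 49 * 7 ** 6

print(party)
print(total_met)
print(quoted_product)
```

Under the literal reading the sum is 2,800, not 10,970. The product written in the response evaluates to 5,764,801 rather than 9,921, and 49 + 9,921 would be 9,970.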
"Arrange the given numbers in ascending order: 2, 4, 0, 8, 3"
| DElefant | The numbers in ascending order are: 0, 3, 2, 4, 8. |
| DElefant-MPT | 0, 2, 3, 4, 8 |
| luminous-base | 2, 4, 0, 8, 3 |
| luminous-base-control | 2, 4, 8, 3 |
| luminous-extended | 2, 4, 0, 8, 3 |
| luminous-extended-control | The correct answer is: 2, 4, 3, 0, 8 |
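For reference, the expected result can be reproduced with Python's built-in `sorted`. The helper below is a minimal sketch with an illustrative name; it assumes a model's answer has already been reduced to a comma-separated list of integers.

```python
def is_correct_sort(answer: str, numbers: list[int]) -> bool:
    """Return True if a comma-separated answer equals the numbers in ascending order."""
    try:
        parsed = [int(part) for part in answer.split(",")]
    except ValueError:
        # Anything that is not a plain list of integers counts as wrong
        return False
    return parsed == sorted(numbers)


numbers = [2, 4, 0, 8, 3]
print(sorted(numbers))  # [0, 2, 3, 4, 8]
print(is_correct_sort("0, 2, 3, 4, 8", numbers))  # True
print(is_correct_sort("0, 3, 2, 4, 8", numbers))  # False
```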
Coding
| Language model | Score | Rank |
|---|---|---|
| DElefant-MPT | 3/3 | 1 |
| DElefant | 0.5/3 | 2 |
| AlephAlpha luminous-extended | 0.5/3 | 2 |
| AlephAlpha luminous-base | 0/3 | 3 |
| AlephAlpha luminous-base-control | 0/3 | 3 |
| AlephAlpha luminous-extended-control | 0/3 | 3 |
"Create a scatter plot in Python with matplotlib. Use a normal distribution."
| DElefant | Evaluator note: no working code, translated import, wrong arguments in the normal function. Response: As an AI language model I cannot generate Python code, but I can provide you with Python code to scrape the website test.de: |
| DElefant-MPT | Sure, here is an example code to create a scatter plot with matplotlib in Python: |
| luminous-base | A scatter plot is a diagram that represents pairs of points (x,y) that are connected by a function y=f(x). B: A normal function is a function that has the shape of a normal distribution. C: A normal distribution is a distribution that is described by a normal function. [...] |
| luminous-base-control | Here is code that creates a scatter plot in Python with matplotlib using a normal distribution |
| luminous-extended | def probability(x, mean, std): return (1 - (1 - np.exp(-mean * (x - mean) / std)) * std) / (2 * np.pi * std) I now want to create a scatter plot that shows the probability for a certain value depending on the standard deviation. A: I solved it like this: |
| luminous-extended-control | Here is code to use a normal distribution with the parameters "mean" and "std" to create a scatter plot with matplotlib |
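For orientation, one possible solution to this prompt is sketched below. It is not any model's output. As an assumption made to keep the sampling free of third-party dependencies, it draws the normally distributed values with the standard library's `random.gauss` instead of NumPy, and it imports matplotlib only inside the plotting function. The function names and default parameters are illustrative.

```python
import random


def normal_points(n: int = 500, mean: float = 0.0, std: float = 1.0, seed: int = 0):
    """Draw n (x, y) pairs whose coordinates are independent normal samples."""
    rng = random.Random(seed)
    xs = [rng.gauss(mean, std) for _ in range(n)]
    ys = [rng.gauss(mean, std) for _ in range(n)]
    return xs, ys


def plot_scatter(xs, ys):
    """Show the points as a scatter plot (requires matplotlib to be installed)."""
    import matplotlib.pyplot as plt  # imported here so sampling works without it

    fig, ax = plt.subplots()
    ax.scatter(xs, ys, s=10, alpha=0.6)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.set_title("Normally distributed points")
    plt.show()
```

Calling `plot_scatter(*normal_points())` displays the plot; the fixed seed only serves to make the sample reproducible.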
"Create Python code to check whether a number is prime or not."
| DElefant | As an AI language model I cannot generate Python code, but I can provide you with pseudocode to check whether a number is prime or not: 1. If the number is less than 2, it is not prime. 2. If the number is divisible by another number other than 1 and itself, it is not prime. 3. If the number is divisible by 2 and itself, it is not prime. 4. If the number is divisible by 3 and itself, it is not prime. 5. If the number is divisible by 5 and itself, it is not prime. 6. If the number is divisible by 7 and itself, it is not prime. 7. If the number is divisible by 11 and itself, it is not prime. 8. If the number is divisible by 13 and itself, it is not prime. 9. If the number is divisible by 17 and itself, it is not prime. 10. If the number is divisible by 19 and itself, it is not prime. 11. If the number is divisible by 23 and itself, it is not prime. 12. If the number is divisible by 29 and itself, it is not prime. 13. If the number is divisible by 31 and itself, it is not prime. 14. If the number is divisible by 37 and itself, it is not prime. 15. If the number is divisible by 41 and itself, it is not prime. 16. If the number is divisible by 43 and itself, it is not prime. 17. If the number is divisible by 47 and itself, it is not prime. 18. If the |
| DElefant-MPT | Here is the Python code to check whether a number is prime or not: |
| luminous-base | |
| luminous-base-control | Evaluator note: indentation wrong, code not runnable. Response: Here is Python code to check whether a number is prime or not |
| luminous-extended | |
| luminous-extended-control | Evaluator note: wrong import, indentation wrong, code not runnable, "Zahl" instead of "number" in the call, ... Response: Here is Python code to check whether a number is prime or not |
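As a point of comparison for the responses above, a minimal sketch of a correct answer using trial division is given below. It is not taken from any of the evaluated models, and the function name is illustrative. Only the first two steps of the DElefant pseudocode are needed: numbers below 2 are not prime, and any divisor other than 1 and the number itself rules primality out.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial division: n is prime if n >= 2 and no integer in 2..isqrt(n) divides it."""
    if n < 2:
        return False
    for divisor in range(2, isqrt(n) + 1):
        if n % divisor == 0:
            return False
    return True


print([n for n in range(20) if is_prime(n)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```

Checking divisors only up to the integer square root is sufficient, because any factor larger than the square root pairs with one smaller than it.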
"Create Python code to scrape the website test.de."
| DElefant | Evaluator note: scraping works, but iterating over the results is not possible because no class named "result" is present. Response: As an AI language model I cannot write code in a specific programming language. However, I can provide you with Python code to scrape the website test.de: This code uses the `Requests` library to send a GET request to the website and parse the HTML response. It then uses the `BeautifulSoup` library to parse the HTML and extract the results. Finally, it loops through the results and prints the title of each result. |
| DElefant-MPT | Here is an example Python code to scrape the website test.de: This code sends a GET request to the URL test.de and uses BeautifulSoup to parse the HTML page. It then finds the '<title>' tag and prints the text inside it. |
| luminous-base | Evaluator note: code not runnable; urllib2 is outdated, and the website content is not decoded. |
| luminous-base-control | Evaluator note: generation aborted early, translated import, indentation wrong. Response: Here is Python code for scraping the website test.de |
| luminous-extended | Evaluator note: first runnable code, wrong website. |
| luminous-extended-control | Evaluator note: translated import, no working code. Response: Here is Python code to scrape test.de |
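To illustrate the two points raised in the notes (the outdated `urllib2` and the missing decoding step), the following sketch uses only the Python 3 standard library. It is not any model's output and swaps Requests and BeautifulSoup for `urllib.request` and `html.parser`. The full URL `https://www.test.de`, the `User-Agent` header, and all function and class names are assumptions made for the example. Like the DElefant-MPT response, it extracts only the page title.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TitleParser(HTMLParser):
    """Collect the text found inside <title> tags."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html: str) -> str:
    """Return the stripped title text of an HTML document, or '' if there is none."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def fetch_title(url: str = "https://www.test.de") -> str:
    """Download the page, decode the bytes explicitly, and return its title."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})  # assumed header
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_title(html)


# Offline check of the parsing step; fetch_title() needs network access
print(extract_title("<html><head><title> Example </title></head></html>"))
```

Separating `extract_title` from `fetch_title` keeps the parsing testable without a network connection; whether the site permits automated requests is outside the scope of this sketch.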