GPT-4 is OpenAI’s most advanced system, producing safer and more useful responses

Deal Major

GPT-4 can solve difficult problems with greater accuracy, thanks to its broader general knowledge and problem-solving abilities.


GPT-4 is more creative and collaborative than ever before. 

It can generate, edit, and iterate with users on creative and technical writing tasks, such as composing songs, writing screenplays, or learning a user’s writing style...


In a casual conversation, the distinction between GPT-3.5 and GPT-4 can be subtle. The difference comes out when the complexity of the task reaches a sufficient threshold—GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5.

To understand the difference between the two models, we tested on a variety of benchmarks, including simulating exams that were originally designed for humans. We proceeded by using the most recent publicly-available tests (in the case of the Olympiads and AP free response questions) or by purchasing 2022–2023 editions of practice exams. We did no specific training for these exams. A minority of the problems in the exams were seen by the model during training, but we believe the results to be representative—see our technical report for details.

[Figure: Exam results (ordered by GPT-3.5 performance). Y-axis: estimated percentile lower bound (among test takers), 0% to 100%. X-axis: exam, in order: AP Calculus BC, AMC 12, Codeforces Rating, AP English Literature, AMC 10, Uniform Bar Exam, AP English Language, AP Chemistry, GRE Quantitative, AP Physics 2, USABO Semifinal 2020, AP Macroeconomics, AP Statistics, LSAT, GRE Writing, AP Microeconomics, AP Biology, GRE Verbal, AP World History, SAT Math, AP US History, AP US Government, AP Psychology, AP Art History, SAT EBRW, AP Environmental Science. Series: gpt-4, gpt-4 (no vision), gpt3.5.]
| Simulated exams | GPT-4 (estimated percentile) | GPT-4 (no vision) (estimated percentile) | GPT-3.5 (estimated percentile) |
|---|---|---|---|
| Uniform Bar Exam (MBE+MEE+MPT) | 298 / 400 (~90th) | 298 / 400 (~90th) | 213 / 400 (~10th) |
| LSAT | 163 (~88th) | 161 (~83rd) | 149 (~40th) |
| SAT Evidence-Based Reading & Writing | 710 / 800 (~93rd) | 710 / 800 (~93rd) | 670 / 800 (~87th) |
| SAT Math | 700 / 800 (~89th) | 690 / 800 (~89th) | 590 / 800 (~70th) |
| Graduate Record Examination (GRE) Quantitative | 163 / 170 (~80th) | 157 / 170 (~62nd) | 147 / 170 (~25th) |
| Graduate Record Examination (GRE) Verbal | 169 / 170 (~99th) | 165 / 170 (~96th) | 154 / 170 (~63rd) |
| Graduate Record Examination (GRE) Writing | 4 / 6 (~54th) | 4 / 6 (~54th) | 4 / 6 (~54th) |
| USABO Semifinal Exam 2020 | 87 / 150 (99th–100th) | 87 / 150 (99th–100th) | 43 / 150 (31st–33rd) |
| USNCO Local Section Exam 2022 | 36 / 60 | 38 / 60 | 24 / 60 |
| Medical Knowledge Self-Assessment Program | 75% | 75% | 53% |
| Codeforces Rating | 392 (below 5th) | 392 (below 5th) | 260 (below 5th) |
| AP Art History | 5 (86th–100th) | 5 (86th–100th) | 5 (86th–100th) |
| AP Biology | 5 (85th–100th) | 5 (85th–100th) | 4 (62nd–85th) |
| AP Calculus BC | 4 (43rd–59th) | 4 (43rd–59th) | 1 (0th–7th) |
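
To make the size of the gap between the two models easier to compare across exams with different scales, the raw scores above can be put on a common footing. The snippet below is a minimal sketch, not part of the original evaluation: it copies a few rows from the table and expresses the GPT-4 minus GPT-3.5 difference as a percentage of each exam's maximum score. The names `SCORES`, `gap_percent_of_max`, and `gaps` are hypothetical, and dividing by the maximum score is an assumption made for illustration (it ignores the fact that scales such as the SAT and GRE do not start at zero).

```python
# Scores copied from the results table: (exam, max score, GPT-4, GPT-3.5).
# Only exams with a stated maximum score are included.
SCORES = [
    ("Uniform Bar Exam", 400, 298, 213),
    ("SAT Math", 800, 700, 590),
    ("GRE Quantitative", 170, 163, 147),
    ("GRE Verbal", 170, 169, 154),
    ("USABO Semifinal Exam 2020", 150, 87, 43),
]


def gap_percent_of_max(max_score, gpt4, gpt35):
    """Return the GPT-4 minus GPT-3.5 gap as a percentage of the maximum score.

    Assumption for illustration: normalizing by the maximum score only.
    """
    return 100.0 * (gpt4 - gpt35) / max_score


def gaps(rows=SCORES):
    """Map each exam name to its normalized score gap."""
    return {name: gap_percent_of_max(mx, g4, g35) for name, mx, g4, g35 in rows}


if __name__ == "__main__":
    # Print exams from the largest normalized gap to the smallest.
    for name, gap in sorted(gaps().items(), key=lambda kv: -kv[1]):
        print(f"{name}: +{gap:.1f}% of the maximum score")
```

This only compares raw scores; the percentile columns in the table remain the better guide to how each model ranks among human test takers.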



GPT-4, reportedly 500 times more powerful than the current ChatGPT, will be released next week.
The current version of ChatGPT is built on GPT-3.5, which is reported to have 175 billion machine learning parameters.

