{"id":137184,"date":"2024-07-23T17:17:45","date_gmt":"2024-07-23T16:17:45","guid":{"rendered":"https:\/\/www.mobileworldlive.com\/?p=409559"},"modified":"2024-07-23T17:17:45","modified_gmt":"2024-07-23T16:17:45","slug":"researchers-seek-llm-baseline-in-human-interactions","status":"publish","type":"post","link":"https:\/\/gamefootballmobileanimeiphone.com\/index.php\/2024\/07\/23\/researchers-seek-llm-baseline-in-human-interactions\/","title":{"rendered":"Researchers seek LLM baseline in human interactions"},"content":{"rendered":"<p>Researchers backed by a trio of leading US universities proposed a mechanism for evaluating the capabilities of large language models (LLMs) based on human psychology, addressing what they stated is a major problem in benchmarking caused by diverse use cases.<\/p>\n<p>In a study of whether LLMs function as people expect, researchers funded by Harvard University, Massachusetts Institute of Technology (MIT) and the University of Chicago devised a method to evaluate how human generalisations impact their assessment of the AI-related technologies.<\/p>\n<p>MIT explained people \u201cform beliefs\u201d about what we think others \u201cdo and do not know\u201d when we interact, a principle which is then carried into our assessment of how well an LLM performs.<\/p>\n<p>Researchers developed a human generalisation function by \u201casking questions, observing how a person or LLM responds and then making inferences about how that person or model would respond to related questions\u201d.<\/p>\n<p>If an LLM shows it can handle a complex subject, people will expect it to be proficient in related, less-complicated areas.<\/p>\n<p>Models which fall short of this belief \u201ccould fail when deployed\u201d, MIT stated.<\/p>\n<p><strong>Baseline<\/strong><br \/>A survey developed based on whether participants believed a person or LLM would answer related questions correctly or incorrectly yielded \u201ca dataset of nearly 19,000 examples of how humans generalise about LLM performance across 79 diverse tasks\u201d.<\/p>\n<p>The survey found participants were less able to generalise about how LLMs would perform compared to other people, a facet researchers believe could impact the way models are deployed moving forward.<\/p>\n<p>Alex Imas, Professor of behavioural science and economics at the University of Chicago\u2019s Booth School of Business, said the research highlighted a \u201ccritical issue with deploying LLMs for general consumer use\u201d, because people may be put off using the models if they do not fully understand when responses will be accurate.<\/p>\n<p>Imas added the study also provides something of a fundamental baseline for assessing LLM performance, specifically whether they \u201cunderstand the problem they are solving\u201d when giving correct answers, in turn helping to improve performance in real-world scenarios.<\/p>\n<p>The post <a href=\"https:\/\/www.mobileworldlive.com\/ai-cloud\/researchers-seek-llm-baseline-in-human-interactions\/\">Researchers seek LLM baseline in human interactions<\/a> appeared first on <a href=\"https:\/\/www.mobileworldlive.com\/\">Mobile World Live<\/a>.<\/p>\n\n<h2><b>Commercials Cooperation Advertisements:<\/b><\/h2>\r\n<p><br>(1) IT Teacher IT Freelance<br> <\/p>\r\n<a href=https:\/\/itteacheritfreelance.hk\/wordpress><img src=http:\/\/gamefootballmobileanimeiphone.com\/wp-content\/uploads\/2023\/09\/ITTeacherITFreelance-Website.png alt=IT\u96fb\u8166\u88dc\u7fd2 java\u88dc\u7fd2 \u70ba\u5927\u5bb6\u914d\u5c0d\u96fb\u8166\u88dc\u7fd2,IT freelance, \u79c1\u4eba\u8001\u5e2b, PHP\u88dc\u7fd2,CSS\u88dc\u7fd2,XML,Java\u88dc\u7fd2,MySQL\u88dc\u7fd2,graphic design\u88dc\u7fd2,\u4e2d\u5c0f\u5b78ICT\u88dc\u7fd2,\u4e00\u5c0d\u4e00\u79c1\u4eba\u88dc\u7fd2\u548cFreelance\u81ea\u7531\u5de5\u4f5c\u914d\u5c0d\u3002\/><\/a><p><a href=https:\/\/itteacheritfreelance.hk\/wordpress\/index.php\/findteacher>\u7acb\u523b\u8a3b\u518a\u53ca\u5831\u540d\u96fb\u8166\u88dc\u7fd2\u8ab2\u7a0b\u5427! <\/a><br>\r\n\r\n\u7535\u5b50\u8ba1\u7b97\u673a -\u6559\u80b2 -IT \u96fb\u8166\u73ed\u201d ( IT\u96fb\u8166\u88dc\u7fd2 ) \u63d0\u4f9b\u4e00\u500b\u65b9\u4fbf\u7684\u7535\u5b50\u8ba1\u7b97\u673a \u6559\u80b2\u5e73\u53f0, \u70ba\u5927\u5bb6\u914d\u5c0d\u4fe1\u606f\u6280\u672f, \u96fb\u8166 \u8001\u5e2b, IT freelance \u548c programming expert. \u8b93\u5927\u5bb6\u65b9\u4fbf\u5730\u5c31\u80fd\u627e\u5230\u5408\u9069\u7684\u96fb\u8166\u88dc\u7fd2, \u96fb\u8166\u73ed, \u5bb6\u6559, \u79c1\u4eba\u8001\u5e2b.  <br>\r\n\r\nWe are a education and information platform which you can find a IT private tutorial teacher or freelance. <br>\r\n\r\nAlso we provide different information about information technology, Computer, programming, mobile, Android, apple, game, movie, anime, animation\u2026 \r\n<\/p>\n<p><br>(2) ITSec<br> <\/p><a href=https:\/\/itsec.vip><img src=http:\/\/gamefootballmobileanimeiphone.com\/wp-content\/uploads\/2023\/09\/ITSec-Main-Promotion-Image.png alt= https:\/\/itsec.vip\/\r\nSecure Your Computers from Cyber Threats and mitigate risks with professional services to defend Hackers.  \r\nITSec provide IT Security and Compliance Services, including IT Compliance Services, Risk Assessment, IT Audit, Security Assessment and Audit, ISO 27001 Consulting and Certification, GDPR Compliance Services, Privacy Impact Assessment (PIA), Penetration test, Ethical Hacking, Vulnerabilities scan, IT Consulting, Data Privacy Consulting, Data Protection Services, Information Security Consulting, Cyber Security Consulting, Network Security Audit, Security Awareness Training.\/><\/a> \r\n<br><br> \r\n<p><a href=https:\/\/itsec.vip>www.ITSec.vip<\/a> <br> <br> \r\n<p><a href=https:\/\/sraa.com.hk>www.Sraa.com.hk<\/a> <br> <br> \r\n<p><a href=https:\/\/itsec.hk>www.ITSec.hk<\/a> <br> <br> \r\n<p><a href=https:\/\/penetrationtest.hk>www.Penetrationtest.hk<\/a> <br> <br> \r\n<p><a href=https:\/\/itseceu.uk>www.ITSeceu.uk<\/a> <br> <br> \r\nSecure Your Computers from Cyber Threats and mitigate risks with professional services to defend Hackers. <br><br>\r\nITSec provide IT Security and Compliance Services, including IT Compliance Services, Risk Assessment, IT Audit, Security Assessment and Audit, ISO 27001 Consulting and Certification, GDPR Compliance Services, Privacy Impact Assessment (PIA), Penetration test, Ethical Hacking, Vulnerabilities scan, IT Consulting, Data Privacy Consulting, Data Protection Services, Information Security Consulting, Cyber Security Consulting, Network Security Audit, Security Awareness Training. \r\n<br><br>Contact us right away. <br><br>Email (Prefer using email to contact us): <br>SalesExecutive@ITSec.vip<\/p>","protected":false},"excerpt":{"rendered":"<p>Researchers backed by a trio of leading US universities proposed a mechanism for evaluating the capabilities of large language models based on human psychology, addressing what they stated is a major problem in benchmarking caused by diverse use cases.<\/p>\n<p>The post <a href=\"https:\/\/www.mobileworldlive.com\/ai-cloud\/researchers-seek-llm-baseline-in-human-interactions\/\">Researchers seek LLM baseline in human interactions<\/a> appeared first on <a href=\"https:\/\/www.mobileworldlive.com\/\">Mobile World Live<\/a>.<\/p>\n","protected":false},"author":132,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"site-container-style":"default","site-container-layout":"default","site-sidebar-layout":"default","disable-article-header":"default","disable-site-header":"default","disable-site-footer":"default","disable-content-area-spacing":"default","footnotes":""},"categories":[22],"tags":[61,122,127,129,124,128,125,132,131,133,126,130,123,66,94,88,97,56,64,65,60,112,40,75,95,104,33,120,105,101,98,115,30,29,41,86,70,69,68,72,71,26,118,108,87,46,55,48,52,54,51,50,83,62,58,57,109,35,59,63,85,79,82,96,80,27,81,114,44,42,43,45,38,39,110,117,100,111,116,73,89,90,92,91,93,84,78,37,102,34,36,77,67,74,99,113,119,28,121,32,47,49,53,103,31,76],"class_list":["post-137184","post","type-post","status-publish","format-standard","hentry","category-mobile","tag-airpods","tag-anime","tag-anime-characters","tag-anime-cosplay","tag-anime-edits","tag-anime-merchandise","tag-anime-movies","tag-anime-news","tag-anime-recommendations","tag-anime-reviews","tag-anime-series","tag-anime-streaming","tag-animes","tag-app-store","tag-app-store-samsung","tag-appgallery","tag-appgallery-oneplus","tag-apple","tag-apple-music","tag-apple-tv","tag-apple-watch","tag-bbc-sport","tag-best-mobile-games","tag-bixby","tag-bixby-xiaomi","tag-champions-league","tag-cyberpunk","tag-cyberpunk-2077","tag-fantasy-football","tag-fifa","tag-football","tag-formula-1","tag-fortnite","tag-free-fire","tag-free-mobile-games","tag-freebuds-pro","tag-galaxy-a52","tag-galaxy-note-20","tag-galaxy-s21","tag-galaxy-watch-4","tag-galaxy-z-fold-3","tag-game","tag-games","tag-golf","tag-harmonyos","tag-how-to-backup-iphone","tag-how-to-factory-reset-iphone","tag-how-to-reset-iphone","tag-how-to-restore-iphone","tag-how-to-unlock-iphone","tag-how-to-unlock-iphone-5","tag-how-to-unlock-iphone-6","tag-huawei","tag-ios","tag-ipad","tag-iphone","tag-live-soccer","tag-lol","tag-macbook","tag-macos","tag-mate-40-pro","tag-mi-11-lite","tag-mi-home-security-camera-basic-1080p","tag-mi-home-security-camera-basic-1080p-huawei","tag-mi-smart-band-6","tag-minecraft","tag-miui","tag-mlb-scores","tag-mobile-game-design","tag-mobile-game-development","tag-mobile-game-marketing","tag-mobile-game-monetization","tag-mobile-games","tag-mobile-gaming","tag-nba-scores","tag-nba-standings","tag-nfl","tag-nfl-scores","tag-nhl-scores","tag-one-ui","tag-oneplus","tag-oneplus-9-pro","tag-oneplus-buds-pro","tag-oneplus-nord-ce-5g","tag-oxygenos","tag-p40-pro-plus","tag-poco-x3-pro","tag-pokemon","tag-premier-league","tag-pubg","tag-pubg-mobile","tag-redmi-note-10-pro","tag-samsung","tag-samsung-pay","tag-soccer","tag-sports","tag-steam","tag-steeam","tag-top-10-anime","tag-valorant","tag-when-do-the-iphone-7-come-out","tag-when-does-the-iphone-7-come-out","tag-when-is-the-iphone-7-coming-out","tag-world-cup","tag-xbox-series-x","tag-xiaomi"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/gamefootballmobileanimeiphone.com\/index.php\/wp-json\/wp\/v2\/posts\/137184","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/gamefootballmobileanimeiphone.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gamefootballmobileanimeiphone.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gamefootballmobileanimeiphone.com\/index.php\/wp-json\/wp\/v2\/users\/132"}],"replies":[{"embeddable":true,"href":"https:\/\/gamefootballmobileanimeiphone.com\/index.php\/wp-json\/wp\/v2\/comments?post=137184"}],"version-history":[{"count":1,"href":"https:\/\/gamefootballmobileanimeiphone.com\/index.php\/wp-json\/wp\/v2\/posts\/137184\/revisions"}],"predecessor-version":[{"id":137186,"href":"https:\/\/gamefootballmobileanimeiphone.com\/index.php\/wp-json\/wp\/v2\/posts\/137184\/revisions\/137186"}],"wp:attachment":[{"href":"https:\/\/gamefootballmobileanimeiphone.com\/index.php\/wp-json\/wp\/v2\/media?parent=137184"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/gamefootballmobileanimeiphone.com\/index.php\/wp-json\/wp\/v2\/categories?post=137184"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gamefootballmobileanimeiphone.com\/index.php\/wp-json\/wp\/v2\/tags?post=137184"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}