Big Data Is Not the Same as Scientific Law | Nobel Laureate Wilczek's Column

科技工作者之家 2019-10-09

Source | WeChat public account “蔻享学术”

Author | Frank Wilczek (professor at the Massachusetts Institute of Technology, 2004 Nobel laureate)

Translators | 梁丁当, 胡风

The history of astronomy shows that observations can only explain so much without the interpretive frame of theories and models.

Big data and machine learning are powering new approaches to many scientific questions. But the history of astronomy offers an interesting perspective on how data informs science, and perhaps a cautionary tale.

Early Babylonian astronomers took what today we'd call a pure "big data" or "pattern recognition" approach. They accumulated observations of solar, lunar and planetary motion and eclipses for many centuries and identified various cycles that had repeated many times. Simply by assuming that those cycles would continue, they were able to give good advice for planting, irrigation and harvest times, to cast credible horoscopes and to predict in advance when lunar eclipses would occur.

The ancient Greek astronomers used two distinct methods to understand the same data set. The first was to make geometric models that treated the sun, moon, planets and stars as mathematical abstractions: shiny points carried upon uniformly rotating celestial spheres.

At first, the Greeks' predictions were no better than those of the Babylonians; in fact, they were significantly worse. But they patched things up by postulating additional movements of the spheres, called epicycles. These models, which were perfected by the 2nd-century astronomer Ptolemy, seem ugly in retrospect, but they did package the astronomical data in a relatively compact form, and they gave useful practical results.
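To make the idea of an epicycle concrete, here is a minimal sketch rather than a reconstruction of Ptolemy's actual model. It assumes a single planet carried on a small circle (the epicycle) whose center moves on a larger circle (the deferent) around a central Earth. The function names, radii and rotation rates are hypothetical values chosen only for illustration.

```python
import math

def epicycle_position(t, deferent_radius=1.0, deferent_rate=1.0,
                      epicycle_radius=0.4, epicycle_rate=8.0):
    """Planet position at time t, with the Earth fixed at the origin.

    All default parameters are illustrative assumptions, not historical values.
    """
    # Center of the epicycle, carried uniformly around the deferent.
    cx = deferent_radius * math.cos(deferent_rate * t)
    cy = deferent_radius * math.sin(deferent_rate * t)
    # The planet moves uniformly around that moving center.
    x = cx + epicycle_radius * math.cos(epicycle_rate * t)
    y = cy + epicycle_radius * math.sin(epicycle_rate * t)
    return x, y

def apparent_longitude(t):
    """Direction of the planet as seen from the Earth, in radians."""
    x, y = epicycle_position(t)
    return math.atan2(y, x)

# Far side of the epicycle: the planet drifts forward across the sky.
forward = apparent_longitude(0.01) - apparent_longitude(-0.01)
# Near side of the epicycle (around t = pi / 7 for these rates):
# the planet appears to drift backward, i.e. retrograde motion.
t_near = math.pi / 7
backward = apparent_longitude(t_near + 0.01) - apparent_longitude(t_near - 0.01)
print(forward > 0, backward < 0)
```

With these assumed parameters, two uniform circular motions are enough to produce the occasional backward drift of a planet against the stars, which is the kind of feature the epicycles were introduced to capture.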


The second method used by Greek astronomers was to consider astronomical bodies as real objects with physical properties. Perhaps the high point of this effort was the brilliant determination by Aristarchus, in the 3rd century B.C., of the ratio of the distances from the Earth to the sun and the moon. Assuming that the moon shines by reflected sunlight, and measuring the angle between the sun and the half-moon when both are visible in the sky, he calculated the ratio using simple trigonometry.
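The trigonometry can be sketched as follows. When the moon appears exactly half lit, the angle at the moon between the Earth and the sun is a right angle, so the angle measured from the Earth between the moon and the sun satisfies cos(angle) = (Earth-moon distance) / (Earth-sun distance). The function name is hypothetical. The 87-degree figure is the value traditionally attributed to Aristarchus and does not appear in the text above, and 89.85 degrees is only an approximate modern value of the same angle.

```python
import math

def sun_to_moon_distance_ratio(angle_deg):
    """Ratio (Earth-sun distance) / (Earth-moon distance) at half-moon.

    angle_deg is the angle seen from the Earth between the moon and the sun.
    The right angle at the moon gives cos(angle) = d_moon / d_sun.
    """
    if not 0.0 <= angle_deg < 90.0:
        raise ValueError("angle must be in [0, 90) degrees")
    return 1.0 / math.cos(math.radians(angle_deg))

# Angle traditionally attributed to Aristarchus (an assumption, not from the text).
aristarchus_ratio = sun_to_moon_distance_ratio(87.0)
# Approximate modern value of the same angle (also an assumption).
modern_ratio = sun_to_moon_distance_ratio(89.85)

print(f"{aristarchus_ratio:.1f}")  # about 19.1
print(f"{modern_ratio:.0f}")
```

The method is sound, but the result is extremely sensitive to the measured angle: a change of less than three degrees moves the ratio from roughly 19 to nearly 400, which is why the physical reasoning was right while the number itself was far off.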

Yet a proper synthesis of the mathematical and physical approaches to astronomy wasn't achieved for many centuries. That's because the available "big data" (the easily observable patterns of the sun, moon and stars) are cryptic, superficial signs of the deep structure beneath.

Copernicus, in the 16th century, discovered that he could get more beautiful versions of Ptolemy-style models if he put the sun, rather than the Earth, at the center of the celestial spheres. Ptolemy's work typically gets rough treatment in the history of science, but it was absolutely essential to Copernicus's breakthrough, which offered a physical explanation of "coincidences" among the model's parameters.

Not long after, Galileo's homemade telescope revealed the phases of Venus, Jupiter's attendant satellites (a "solar system" in miniature) and the topography of the moon. The night sky came to life as a showcase of tangible, physical bodies rather than an exercise in idealized points and imaginary spheres. When Isaac Newton distilled the universal laws of motion and gravity, he reunited the "big data" approach of the Babylonians and Ptolemy with the physics of Aristarchus and Galileo, launching truly modern science.

The big lesson is that big data doesn't interpret itself. Making mathematical models, trying to keep them simple, connecting to the fullness of reality and aspiring to perfection: these are proven ways to refine the raw ore of data into precious jewels of meaning.

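About the author

Frank Wilczek is a professor of physics at the Massachusetts Institute of Technology and one of the founders of quantum chromodynamics. He received the 2004 Nobel Prize in Physics for his work on the theory of quarks (the strong interaction).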

Source: fanpu2019 返朴

Original link: http://mp.weixin.qq.com/s?__biz=MzUxNzQyMjU5NQ==&mid=2247486993&idx=2&sn=bc10e25e4761a4224a426adbe2cb54ae&chksm=f999257dceeeac6b075d6bf033e61037fd8fcce2f65510783aac02b34369f9d4440fb47daeff&scene=27#wechat_redirect