Preprints are preliminary versions of research papers that are made available before formal peer review and publication in scientific journals. There are now many preprint servers across a wide range of disciplines, providing an easier route than journal publication for disseminating findings and getting feedback. The visibility of preprints is reflected in the higher Altmetric scores and citation counts of published articles that were first posted as preprints.
To support an iterative publishing workflow, most preprint servers allow revisions to be submitted. Unlike traditional journal publishing, this allows you to see how research has evolved over time. Analysing the changes between preprint versions could give an insight into how preprints are being used and whether they are being improved in response to feedback. To get a better understanding of this, I looked into the preprint revision patterns on medRxiv, including the contents of author-provided revision summaries. The code for this analysis is available on GitHub.
The first preprint was published on medRxiv on 25 June 2019, and over 40k preprints have been posted since. The number of new preprints each month peaked during the COVID-19 pandemic, with over 2k preprints posted in May 2020. It has since stabilised at around 1k new preprints per month.

As preprints represent work at an earlier stage than journal articles, you might expect many of them to have multiple versions.
Excluding preprints where the first version was published in the last 12 months, 20% have more than one version. The median time between the first two versions for these preprints is 23 days. The distribution of the number of versions for preprints with more than one version is shown below.
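As a rough illustration of how figures like these could be derived, here is a minimal sketch. It assumes one record per preprint version with `doi`, `version` and `date` fields; these field names, the sample records and the `revision_stats` helper are all hypothetical and are not taken from the analysis code.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical records: one dict per preprint version (field names are assumptions).
records = [
    {"doi": "10.1101/a", "version": 1, "date": date(2021, 1, 4)},
    {"doi": "10.1101/a", "version": 2, "date": date(2021, 1, 24)},
    {"doi": "10.1101/b", "version": 1, "date": date(2021, 2, 1)},
    {"doi": "10.1101/c", "version": 1, "date": date(2021, 3, 10)},
    {"doi": "10.1101/c", "version": 2, "date": date(2021, 4, 9)},
    {"doi": "10.1101/c", "version": 3, "date": date(2021, 6, 1)},
    {"doi": "10.1101/d", "version": 1, "date": date(2023, 5, 1)},
]


def revision_stats(records, cutoff):
    """Return (share of preprints revised, median days between versions 1 and 2).

    Preprints whose first version was posted after `cutoff` are excluded,
    so that recent preprints that have not had time to be revised do not
    drag the share down.
    """
    # Group version dates by preprint: {doi: {version: date}}
    by_doi = defaultdict(dict)
    for r in records:
        by_doi[r["doi"]][r["version"]] = r["date"]

    eligible = [v for v in by_doi.values() if 1 in v and v[1] <= cutoff]
    revised = [v for v in eligible if 2 in v]

    share = len(revised) / len(eligible) if eligible else 0.0
    gaps = [(v[2] - v[1]).days for v in revised]
    return share, (median(gaps) if gaps else None)


share, median_gap = revision_stats(records, cutoff=date(2022, 12, 31))
print(f"{share:.0%} revised, median gap {median_gap} days")
```

In practice the cutoff would be set to 12 months before the date the data was collected.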

Of those that are revised, ~75% have only 2 versions. This suggests preprint servers are primarily used because they offer an easy way to share research rather than because they support updating research as it evolves in response to feedback. This is supported by a survey by bioRxiv, which showed that increasing awareness of research was the most common motivation for posting preprints.
For the preprints that are revised multiple times, there are some interesting patterns. Below are the revision histories for the top 25 most revised preprints on medRxiv, ordered by the time between the first and latest version (there are similar results from bioRxiv in this analysis).

The most revised preprint had 24 revisions, which represented regular updates as the underlying data was updated over time. Similarly, the top trace on this chart (the preprint with the longest time between the first and latest version) is this preprint — a regularly updated data report — which over 3 years was updated to show the evolving nature of the COVID-19 pandemic.
The easiest way to see changes between preprint versions is by looking at the metadata for each version. This includes changes to the title, abstract, and author list as well as revision summaries submitted by the authors. 28% of revisions include a change to the title and 23% include a change to the author list.
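A comparison like this can be sketched by walking through consecutive versions of a preprint and checking which metadata fields differ. The example below is illustrative only: the `title`, `authors` and `version` field names, the sample data and both helper functions are assumptions, and it treats titles that differ only in whitespace or letter case as unchanged.

```python
def normalise(text):
    """Collapse whitespace and ignore case so cosmetic edits are not counted."""
    return " ".join(text.split()).casefold()


def metadata_changes(versions):
    """Compare each version of one preprint with the version before it."""
    ordered = sorted(versions, key=lambda v: v["version"])
    changes = []
    for prev, curr in zip(ordered, ordered[1:]):
        changes.append({
            "version": curr["version"],
            "title_changed": normalise(prev["title"]) != normalise(curr["title"]),
            "authors_changed": prev["authors"] != curr["authors"],
        })
    return changes


# Hypothetical version history for a single preprint.
history = [
    {"version": 1, "title": "A study of X", "authors": ["A Smith", "B Jones"]},
    {"version": 2, "title": "A study of  X", "authors": ["A Smith", "B Jones", "C Lee"]},
    {"version": 3, "title": "A cohort study of X", "authors": ["A Smith", "B Jones", "C Lee"]},
]

changes = metadata_changes(history)
for change in changes:
    print(change)
```

Comparing the author lists directly also counts a reordering of authors as a change, which may or may not be what you want.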
Revision summaries are useful as there can be major changes between versions, including significant changes in the word count or the number of tables and figures. The proportion of revisions with an author-generated revision summary is shown below. Revision summaries were not present in medRxiv revisions made before June 2020. Once they were supported in July 2020, around 65% of revisions had a summary. As of July 2022, all revisions include a summary.
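One way to compute a monthly proportion like this is to count revisions per calendar month and divide the number with a non-empty summary by the total. This is a sketch under assumptions: the `date` and `summary` field names, the sample revisions and the `summary_share_by_month` helper are hypothetical.

```python
from collections import Counter
from datetime import date


def summary_share_by_month(revisions):
    """Return {"YYYY-MM": share of revisions with a non-empty summary}."""
    total, with_summary = Counter(), Counter()
    for r in revisions:
        month = r["date"].strftime("%Y-%m")
        total[month] += 1
        # Treat missing, None and whitespace-only summaries as absent.
        if (r.get("summary") or "").strip():
            with_summary[month] += 1
    return {m: with_summary[m] / total[m] for m in sorted(total)}


# Hypothetical revisions (version 2 or later of a preprint).
revisions = [
    {"date": date(2020, 5, 3), "summary": None},
    {"date": date(2020, 5, 20), "summary": ""},
    {"date": date(2020, 7, 1), "summary": "Updated figures"},
    {"date": date(2020, 7, 15), "summary": "   "},
    {"date": date(2020, 7, 28), "summary": "Added sensitivity analysis"},
    {"date": date(2022, 8, 9), "summary": "Fixed typo in title"},
]

shares = summary_share_by_month(revisions)
for month, share in shares.items():
    print(month, f"{share:.0%}")
```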

Whilst revision summaries often indicate small changes, such as formatting updates or changes to the title or author list that can also be seen in the metadata (e.g. this revision), there are also larger changes indicated in the revision summaries, such as:
The preprint revision examples above indicate the potential of a feedback ecosystem based around preprints, but the low proportion of preprints with revisions suggests they are mainly used as a way to share work earlier. Commentary on preprints remains relatively rare. Before the COVID-19 pandemic, fewer than 10% of preprints on bioRxiv received commentary through the comment sections. Similarly, a preprint commenting pilot at PLOS found that only a small number of comments were made. Feedback is likely happening through other channels such as email or Twitter, but this is not as visible as feedback surfaced alongside the manuscripts. Enhancing the visibility and integration of feedback from various channels could improve the feedback ecosystem around preprints and encourage more iterative preprints.