Fedora - Use Diffoscope in packager workflows

Use Diffoscope in packager workflows

For a packager, updating packages is a recurring task. For some projects the packager is involved in upstream maintenance, or well-written release notes make it easy to figure out what changed between releases. This isn’t always the case, for instance with a small project maintained by one or two people somewhere on GitHub, and it can be useful to verify what exactly changed. Diffoscope can help determine the changes between package releases.

Diffoscope is a “smart binary diff” tool that was born in the Reproducible Builds project in Debian and is also available in Fedora. It “knows” about various types of text and binary formats, and will try to recursively unpack and compare two blobs. In particular it knows that some objects need to be decompressed before comparing, that archives need to be unpacked, and how to deconstruct binary objects like ELF programs and libraries, Java .jar files, Windows .cab files, etc.

Just today I received a bug report stating that there is a new version of python-libarchive-c available (3.2, while 3.1 is what is currently packaged). It is a simple Python package. But even a simple Python package has some binary files, so a straightforward diff on the unpacked rpms doesn’t really work. Let’s see how Diffoscope can be used to show the differences between the binary packages in detail.

Comparing upstream archives


The first step is to compare the upstream archives:

$ diffoscope python-libarchive-c-3.{1,2}.tar.gz
--- python-libarchive-c-3.1.tar.gz
+++ python-libarchive-c-3.2.tar.gz
│ --- python-libarchive-c-3.1.tar ❶
├── +++ python-libarchive-c-3.2.tar
│ ├── file list
│ │ @@ -1,46 +1,46 @@ ❷
│ │ -drwxrwxr-x 0 root (0) root (0) 0 2021-06-01 07:32:24.000000 python-libarchive-c-3.1/
│ │ --rw-rw-r-- 0 root (0) root (0) 25 2021-06-01 07:32:24.000000 python-libarchive-c-3.1/.gitattributes
│ │ -drwxrwxr-x 0 root (0) root (0) 0 2021-06-01 07:32:24.000000 python-libarchive-c-3.1/.github/
│ │ --rw-rw-r-- 0 root (0) root (0) 20 2021-06-01 07:32:24.000000 python-libarchive-c-3.1/.github/FUNDING.yml
...
│ │ --rw-rw-r-- 0 root (0) root (0) 1331 2021-06-01 07:32:24.000000 python-libarchive-c-3.1/version.py
│ │ +drwxrwxr-x 0 root (0) root (0) 0 2021-10-06 12:40:03.000000 python-libarchive-c-3.2/
│ │ +-rw-rw-r-- 0 root (0) root (0) 25 2021-10-06 12:40:03.000000 python-libarchive-c-3.2/.gitattributes
│ │ +drwxrwxr-x 0 root (0) root (0) 0 2021-10-06 12:40:03.000000 python-libarchive-c-3.2/.github/
│ │ +-rw-rw-r-- 0 root (0) root (0) 20 2021-10-06 12:40:03.000000 python-libarchive-c-3.2/.github/FUNDING.yml
...
│ │ +-rw-rw-r-- 0 root (0) root (0) 1331 2021-10-06 12:40:03.000000 python-libarchive-c-3.2/version.py
...
│ │ --- python-libarchive-c-3.1/libarchive/ffi.py ❸
│ ├── +++ python-libarchive-c-3.2/libarchive/ffi.py
│ │┄ Files 0% similar despite different names
│ │ @@ -43,15 +43,15 @@
│ │ SEEK_CALLBACK = CFUNCTYPE(
│ │ - c_longlong, c_int, c_void_p, c_longlong, c_int
│ │ + c_longlong, c_void_p, c_void_p, c_longlong, c_int
│ │ )
│ │ --- python-libarchive-c-3.1/libarchive/read.py
│ ├── +++ python-libarchive-c-3.2/libarchive/read.py
│ │┄ Files 2% similar despite different names
│ │ @@ -61,17 +61,18 @@
│ │ close_cb = CLOSE_CALLBACK(close_func) if close_func else NO_CLOSE_CB
│ │ + seek_cb = SEEK_CALLBACK(seek_func)
│ │ with new_archive_read(format_name, filter_name, passphrase) as archive_p:
│ │ if seek_func:
│ │ - ffi.read_set_seek_callback(archive_p, SEEK_CALLBACK(seek_func))
│ │ + ffi.read_set_seek_callback(archive_p, seek_cb)
│ │ ffi.read_open(archive_p, None, open_cb, read_cb, close_cb)
│ │ yield archive_read_class(archive_p)
...
│ │ --- python-libarchive-c-3.1/libarchive/write.py
│ ├── +++ python-libarchive-c-3.2/libarchive/write.py
│ │┄ Files identical despite different names
│ │ --- python-libarchive-c-3.1/setup.py ❹
│ ├── +++ python-libarchive-c-3.2/setup.py
│ │┄ Files identical despite different names
...
│ │ --- python-libarchive-c-3.1/version.py ❺
│ ├── +++ python-libarchive-c-3.2/version.py
│ │┄ Files 1% similar despite different names
│ │ @@ -9,15 +9,15 @@
│ │ def get_version():
│ │ # Return the version if it has been injected into the file by git-archive
│ │ - version = tag_re.search('HEAD -> master, tag: 3.1')
│ │ + version = tag_re.search('HEAD -> master, tag: 3.2')
│ │ if version:
│ │ return version.group(1)

At ❶ we see that we’re comparing two archives. (Note: this diff output has been heavily trimmed for readability.)

At ❷ we see that the listings of both archives are different, but the difference is as expected: the version number is included in the name of the top-level directory in the archives, so all paths are different. We also see that the two archives were created at different dates (2021-06-01 07:32:24 and 2021-10-06 12:40:03 respectively). This listing would alert us if upstream added or removed files unexpectedly.
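The check for added or removed files can also be done by hand. Here is a minimal sketch in Python, assuming both tarballs unpack into a single top-level directory as they do here. The helper names member_paths and added_and_removed are made up for this illustration and are not part of diffoscope.

```python
import tarfile


def member_paths(tarball):
    """Return the member paths of a tarball with the top-level directory stripped."""
    paths = set()
    with tarfile.open(tarball) as archive:
        for name in archive.getnames():
            # "python-libarchive-c-3.2/libarchive/ffi.py" -> "libarchive/ffi.py"
            _, _, rest = name.rstrip("/").partition("/")
            if rest:
                paths.add(rest)
    return paths


def added_and_removed(old_tarball, new_tarball):
    """Return (added, removed) paths between two upstream tarballs."""
    old, new = member_paths(old_tarball), member_paths(new_tarball)
    return sorted(new - old), sorted(old - new)
```

Calling added_and_removed("python-libarchive-c-3.1.tar.gz", "python-libarchive-c-3.2.tar.gz") returns two lists; if both are empty, upstream neither added nor removed files, and only the top-level directory name differs.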

At ❸ we see that there were some code changes around SEEK_CALLBACK: the type of its first argument changed in ffi.py, and read.py now creates the callback object once, before passing it to libarchive. The upstream release notes say that “this release fixes the seek callbacks passed to libarchive by the custom_reader and stream_reader functions”, so that change looks reasonable.
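Both hunks are about ctypes callbacks. The following sketch is my reading of the change, not upstream’s explanation. It uses the 3.2 prototype from ffi.py together with a dummy seek_func (hypothetical, for illustration only) and binds the callback object to a name, as read.py now does. The ctypes documentation warns that callback objects must stay referenced for as long as C code may call them, which is presumably why the temporary SEEK_CALLBACK(seek_func) was replaced by seek_cb.

```python
from ctypes import CFUNCTYPE, c_int, c_longlong, c_void_p

# Prototype as of 3.2: return type first, then the argument types.
# The first argument is now a pointer (c_void_p) instead of c_int.
SEEK_CALLBACK = CFUNCTYPE(c_longlong, c_void_p, c_void_p, c_longlong, c_int)


def seek_func(archive_p, context, offset, whence):
    # Dummy callback for illustration: pretend the seek always succeeds
    # and report the requested offset as the new position.
    return offset


# Keep the callback object bound to a name for as long as C code may call it;
# ctypes itself does not hold a reference to it.
seek_cb = SEEK_CALLBACK(seek_func)

# Callback objects are callable from Python too, which allows a quick check.
new_position = seek_cb(None, None, 42, 0)
```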

Most files are not changed, so at ❹ diffoscope dutifully reports that they are the same despite the file name change. At ❺ we get another version-related change.

And that’s it — no more changes in the upstream tarball.
Let’s build the package then and compare it with a previous build.

Comparing binary packages


After adjusting the spec file for the new version, the second step is to build the package and compare with an older version. fedpkg mockbuild puts the resulting packages in a subdirectory named after the version:

$ fedpkg mockbuild
...
INFO: Results and/or logs in: ~/fedora/python-libarchive-c/results_python-libarchive-c/3.2/1.fc36
$ diffoscope results_python-libarchive-c/3.1/3.fc36/python3-libarchive-c-3.1-3.fc36.noarch.rpm \
    results_python-libarchive-c/3.2/1.fc36/python3-libarchive-c-3.2-1.fc36.noarch.rpm
--- results_python-libarchive-c/3.1/3.fc36/python3-libarchive-c-3.1-3.fc36.noarch.rpm
+++ results_python-libarchive-c/3.2/1.fc36/python3-libarchive-c-3.2-1.fc36.noarch.rpm
├── header
│ @@ -1,79 +1,79 @@ ❶
-HEADERIMMUTABLE: 0000003d00001ed50000003f...1000003e8000000060000
+HEADERIMMUTABLE: 0000003d00001e8d0000003f...1000003e8000000060000

In the case of the binary packages, there are many more differences than in the upstream tarball. But let’s try to go through them.

│ HEADERI18NTABLE:
│ - C
-SIGSIZE: 26733
-SIGMD5: 1aa148ac91484fe8cb55fe3334aae10b
-SHA1HEADER: 1659a1431af930a0a824c193780e27f28fc2d03e
-SHA256HEADER: 60e4f84e905bd42693cabe88e63542916a7dfffef052f3e7499cb80a1770c736
+SIGSIZE: 26678
+SIGMD5: e35e3157e01b6ec26b8e1981f0ba38af
+SHA1HEADER: faab0b7ee86b23f753b2c49a52d6a8f3deefc7ca
+SHA256HEADER: 66d39a70dc9e081ba5cc4243e72f434e953d68c9bce8be1127b2b87fa1923d06
│ NAME: python3-libarchive-c
-VERSION: 3.1
-RELEASE: 3.fc36
+VERSION: 3.2
+RELEASE: 1.fc36
│ SUMMARY: Python interface to libarchive
│ DESCRIPTION: The libarchive library provides a flexible interface for reading and writing archives in various
│ formats such as tar and cpio. libarchive also supports reading and writing archives compressed using
│ various compression filters such as gzip and bzip2. A Python interface to libarchive. It uses the
│ standard ctypes module to dynamically load and access the C library.
-BUILDTIME: 1638015932
+BUILDTIME: 1638015863
│ BUILDHOST: spora.local
-SIZE: 68979
+SIZE: 69052
│ LICENSE: CC0
│ GROUP: Unspecified
│ URL: https://github.com/Changaco/python-libarchive-c
│ OS: linux
│ ARCH: noarch

We can see that Diffoscope does something similar to rpmdiff on the two archives, but with much more detail.
At ❶ we see that the rpm header changed, which is not surprising 😉 At ❷ we get the details of the signatures, and version info at ❸. Build timestamp and rpm size may also be interesting at ❹. Diffoscope then prints a comparison of the FILESIZES, FILEMTIMES, FILEMD5S tables in the rpm headers. This would be useful if we were trying to chase down some unexpected difference between packages.
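BUILDTIME is stored as seconds since the Unix epoch. Here is a small Python sketch that makes the two values readable (the helper name epoch_to_utc is made up for this example):

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Format a Unix timestamp from an rpm header as a UTC date and time."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


print(epoch_to_utc(1638015932))  # BUILDTIME of 3.1-3: 2021-11-27 12:25:32
print(epoch_to_utc(1638015863))  # BUILDTIME of 3.2-1: 2021-11-27 12:24:23
```

The 3.2-1 value is the same date and time that shows up for the libarchive directory in the file listing of the new package.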

│ CHANGELOGTIME:
+ - 1638014400
│ - 1627387200
...
- - 1570104000
- - 1566216000
│ CHANGELOGNAME:
+ - Zbigniew Jędrzejewski-Szmek 3.2-1
│ - Fedora Release Engineering - 3.1-2
...
- - Miro Hrončok - 2.8-10
- - Miro Hrončok - 2.8-9
│ CHANGELOGTEXT:
+ - - Version 3.2 (fixes #2027027)
│ - - Second attempt - Rebuilt for https://fedoraproject.org/wiki/Fedora_35_Mass_Rebuild
...
- - - Rebuilt for Python 3.8.0rc1 (#1748018)
- - - Rebuilt for Python 3.8

Here we see that the changelog got trimmed (my entry is added, and two from Miro are dropped). In Fedora we set %_changelog_trimage to 2 years, so even if the spec file defines a longer changelog, in the built package the oldest entries are trimmed away.
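The trimming can be reproduced from the CHANGELOGTIME values above. This is a sketch of the rule as I understand it, not rpm’s actual code: an entry is dropped when it is older than the newest entry by more than the configured age, and two years is assumed here to mean 2 × 365 days.

```python
# Assumed value of %_changelog_trimage, in seconds (2 x 365 days).
TWO_YEARS = 2 * 365 * 24 * 60 * 60


def trim_changelog(times, max_age=TWO_YEARS):
    """Keep only the entries that are at most max_age older than the newest one."""
    newest = max(times)
    return [t for t in times if newest - t <= max_age]


# CHANGELOGTIME values visible in the diff; entries hidden behind "..." are left out.
before = [1627387200, 1570104000, 1566216000]
after = [1638014400] + before

print(trim_changelog(before))  # all three entries survive
print(trim_changelog(after))   # the two oldest entries are dropped
```

Before the update the newest entry was the mass rebuild, and both of Miro’s entries were still within two years of it. Adding the 3.2-1 entry moves the reference point forward by four months, which pushes them over the limit.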

Then we get some expected but important differences:

│ PROVIDEVERSION: ❶
- - 3.1-3.fc36
+ - 3.2-1.fc36
│ OBSOLETEVERSION:
- - 3.1-3.fc36
+ - 3.2-1.fc36
...
│ DIRNAMES: ❷
- - /usr/lib/python3.10/site-packages/libarchive_c-3.1-py3.10.egg-info/
+ - /usr/lib/python3.10/site-packages/libarchive_c-3.2-py3.10.egg-info/
...
│ @@ -1 +1 @@
-RPM v3.0 bin i386/x86_64 python3-libarchive-c-3.1-3.fc36
+RPM v3.0 bin i386/x86_64 python3-libarchive-c-3.2-1.fc36
├── content
│ ├── file list
│ │ @@ -1,35 +1,35 @@
│ │ -drwxr-xr-x 1 0 0 0 2021-10-27 12:20:33.000000 ./usr/lib/python3.10/site-packages/libarchive
│ │ --rw-r--r-- 1 0 0 601 2021-06-01 07:32:24.000000 ./usr/lib/python3.10/site-packages/libarchive/__init__.py
│ │ -drwxr-xr-x 1 0 0 0 2021-10-27 12:20:34.000000 ./usr/lib/python3.10/site-packages/libarchive/__pycache__
...
│ │ +drwxr-xr-x 1 0 0 0 2021-11-27 12:24:23.000000 ./usr/lib/python3.10/site-packages/libarchive
│ │ +-rw-r--r-- 1 0 0 601 2021-10-06 12:40:03.000000 ./usr/lib/python3.10/site-packages/libarchive/__init__.py
│ │ +drwxr-xr-x 1 0 0 0 2021-11-27 12:24:24.000000 ./usr/lib/python3.10/site-packages/libarchive/__pycache__
...
│ ├── ./usr/lib/python3.10/site-packages/libarchive/ffi.py
│ │ @@ -43,15 +43,15 @@
│ │ SEEK_CALLBACK = CFUNCTYPE(
│ │ - c_longlong, c_int, c_void_p, c_longlong, c_int
│ │ + c_longlong, c_void_p, c_void_p, c_longlong, c_int
│ │ )
│ │ @@ -61,17 +61,18 @@
│ │ close_cb = CLOSE_CALLBACK(close_func) if close_func else NO_CLOSE_CB
│ │ + seek_cb = SEEK_CALLBACK(seek_func)
│ │ with new_archive_read(format_name, filter_name, passphrase) as archive_p:
│ │ if seek_func:
│ │ - ffi.read_set_seek_callback(archive_p, SEEK_CALLBACK(seek_func))
│ │ + ffi.read_set_seek_callback(archive_p, seek_cb)
│ │ ffi.read_open(archive_p, None, open_cb, read_cb, close_cb)
...

At ❶ we see that the package version change is reflected in the PROVIDEVERSION and OBSOLETEVERSION tables in the rpm header. There are also PROVIDENAME and OBSOLETENAME tables, but those are unchanged in this case. This is good: we are not expecting any changes to Provides or Obsoletes, except for the version bump.

At ❷ we again see that the version is reflected in a path.

At ❸ and ❹ we see the heavily trimmed lists of files in both packages. All files are reported as different because the modification times changed. But look closely at the timestamps: __init__.py is a file provided by upstream, and its mtime is preserved during installation, so we see the same timestamps as in the first listing from the upstream tarball comparison. The __pycache__ directory, on the other hand, was created during the build and has a timestamp that shows when the build was done. We would see the same for other files produced during the build.

Now comes the boring part:

│ ├── ./usr/lib/python3.10/site-packages/libarchive/__pycache__/__init__.cpython-310.pyc
│ │ @@ -1,8 +1,8 @@
│ │ -00000000: 6f0d 0d0a 0000 0000 88e2 b560 5902 0000 o..........`Y...
│ │ +00000000: 6f0d 0d0a 0000 0000 2399 5d61 5902 0000 o.......#.]aY...
│ ├── ./usr/lib/python3.10/site-packages/libarchive/__pycache__/entry.cpython-310.pyc
│ │ @@ -1,8 +1,8 @@
│ │ -00000000: 6f0d 0d0a 0000 0000 88e2 b560 0c14 0000 o..........`....
│ │ +00000000: 6f0d 0d0a 0000 0000 2399 5d61 0c14 0000 o.......#.]a....
...
│ │ --- ./usr/lib/python3.10/site-packages/libarchive_c-3.1-py3.10.egg-info/PKG-INFO
│ ├── +++ ./usr/lib/python3.10/site-packages/libarchive_c-3.2-py3.10.egg-info/PKG-INFO
│ │┄ Files 0% similar despite different names
│ │ @@ -1,10 +1,10 @@
│ │ Name: libarchive-c
│ │ -Version: 3.1
│ │ +Version: 3.2
│ │ Summary: Python interface to libarchive
│ │ Home-page: https://github.com/Changaco/python-libarchive-c

… yes, diffoscope shows the diff between the hexdump listings of .pyc files (Python bytecode at ❶ and ❷). Those files were created during the build, so we know that those changes correspond to the changes to the sources shown in previous listings. At the end we again see the version changed in the PKG-INFO file (❸).
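The bytes that differ on those first hexdump lines are not bytecode at all. Since Python 3.7 (PEP 552), a .pyc file starts with a 16-byte header: a magic number, a flags field and, for the default timestamp-based files, the modification time and size of the source file. Here is a short Python sketch that decodes the two __init__ headers from the listing (the function name parse_pyc_header is made up for this example):

```python
import struct
from datetime import datetime, timezone


def parse_pyc_header(header):
    """Decode the 16-byte header of a timestamp-based .pyc file (PEP 552)."""
    magic, _crlf, flags, mtime, size = struct.unpack("<H2sIII", header[:16])
    if flags != 0:
        raise ValueError("hash-based .pyc: the last 8 bytes are not mtime and size")
    return {
        "magic": magic,
        "source_mtime": datetime.fromtimestamp(mtime, tz=timezone.utc),
        "source_size": size,
    }


# First 16 bytes of __init__.cpython-310.pyc, copied from the hexdump above.
old = parse_pyc_header(bytes.fromhex("6f0d0d0a 00000000 88e2b560 59020000"))
new = parse_pyc_header(bytes.fromhex("6f0d0d0a 00000000 23995d61 59020000"))

print(old["source_mtime"])  # 2021-06-01 07:32:24+00:00
print(new["source_mtime"])  # 2021-10-06 12:40:03+00:00
print(old["source_size"], new["source_size"])  # 601 601
```

The decoded times are the mtimes of __init__.py in the two upstream tarballs, and 601 is its size in the file listing, so on this first line the only difference is the recorded source timestamp.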

Currently, diffoscope does not know how to show Python bytecode in a better way. But it is possible that in the future it will be able to disassemble the bytecode back into some more readable form and show the diff on that. New parsers are regularly being added to diffoscope. For compiled programs, it will already show a diff on the disassembled machine code.

Conclusion


After looking at all those diffs, I think one can say with some confidence that the upgrade from python-libarchive-c-3.1 to python-libarchive-c-3.2 is safe. In particular, it is suitable even for a stable release because it contains only a bug fix.

Big shout out to Chris Lamb and the other maintainers of diffoscope.



https://www.sickgaming.net/blog/2021/11/...workflows/