Detect usage of uninitialized digits in Objects/longobject.c (_PyLong_New()) #147988

@vstinner

The _PyLong_New() API does not initialize the digits array. Catching usage of uninitialized digits requires building Python with MSAN (-fsanitize=memory), which is not easy to use in practice.


Example to reproduce the old issue gh-102509.

(1) Apply the patch:

diff --git a/Objects/longobject.c b/Objects/longobject.c
index d416fc1747e..343c03f61ed 100644
--- a/Objects/longobject.c
+++ b/Objects/longobject.c
@@ -170,7 +170,7 @@ long_alloc(Py_ssize_t size)
     Py_ssize_t ndigits = size ? size : 1;
 
     if (ndigits == 1) {
-        result = (PyLongObject *)_Py_FREELIST_POP(PyLongObject, ints);
+        //result = (PyLongObject *)_Py_FREELIST_POP(PyLongObject, ints);
     }
     if (result == NULL) {
         /* Number of bytes needed is: offsetof(PyLongObject, ob_digit) +
@@ -189,7 +189,7 @@ long_alloc(Py_ssize_t size)
     _PyLong_SetSignAndDigitCount(result, size != 0, size);
     /* The digit has to be initialized explicitly to avoid
      * use-of-uninitialized-value. */
-    result->long_value.ob_digit[0] = 0;
+    //result->long_value.ob_digit[0] = 0;
     return result;
 }

(2) Build Python with MSAN (--with-memory-sanitizer):

./configure --with-memory-sanitizer --without-pymalloc CC=clang LD=clang CFLAGS="-O0"
make clean
make

(3) make fails with an error like:

==1336053==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x000000f12cf9 in _PyLong_CompactValue /home/vstinner/python/main/./Include/cpython/longintrepr.h:136:5
    #1 0x000000f23f7f in maybe_small_long /home/vstinner/python/main/Objects/longobject.c:71:27
    #2 0x000000fcdc76 in long_bitwise /home/vstinner/python/main/Objects/longobject.c:5680:24
    #3 0x000000faacbc in long_and /home/vstinner/python/main/Objects/longobject.c:5692:12
    (...)

  Uninitialized value was created by a heap allocation
    #0 0x000000440506 in malloc (/home/vstinner/python/main/_bootstrap_python+0x440506) (BuildId: 9ec2535b6793427eb2f3574683be353b2e36dfef)
    #1 0x00000129301b in _PyMem_RawMalloc /home/vstinner/python/main/Objects/obmalloc.c:65:12
    #2 0x00000129ff39 in PyObject_Malloc /home/vstinner/python/main/Objects/obmalloc.c:1649:12
    #3 0x000000f11278 in long_alloc /home/vstinner/python/main/Objects/longobject.c:181:18
    #4 0x000000fcc37b in long_bitwise /home/vstinner/python/main/Objects/longobject.c:5638:9
    #5 0x000000faacbc in long_and /home/vstinner/python/main/Objects/longobject.c:5692:12
    (...)

You can see that the object which long_bitwise() gets from long_alloc() no longer has ob_digit[0] initialized (because of the patch), so maybe_small_long() fails when it calls _PyLong_CompactValue(), which reads ob_digit[0].


I propose changing long_alloc() to initialize digits to a pattern to detect the usage of uninitialized digits when Python is built in debug mode. It should catch bugs like the old issue gh-102509 without having to use the heavy MSAN.
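
To illustrate the idea, here is a minimal standalone sketch, not the actual CPython change. It assumes 30-bit digits stored in a uint32_t, as in common CPython builds. The names sketch_long, sketch_long_alloc(), sketch_long_is_initialized() and SKETCH_FILL_BYTE are hypothetical, and the 0xFF fill byte is only one possible pattern. It works because a filled digit (0xFFFFFFFF) is larger than any valid 30-bit digit, so a simple range check can tell that a digit was never written.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 30-bit digits stored in uint32_t. */
typedef uint32_t digit;
#define SKETCH_SHIFT 30
#define SKETCH_MASK ((digit)(((digit)1 << SKETCH_SHIFT) - 1))

/* Hypothetical fill byte: every digit becomes 0xFFFFFFFF, which is
 * greater than SKETCH_MASK and so can never be a valid digit. */
#define SKETCH_FILL_BYTE 0xFF

/* Simplified stand-in for PyLongObject (no object header, no sign). */
typedef struct {
    size_t ndigits;
    digit ob_digit[];
} sketch_long;

/* Allocate an integer with `size` digits (at least one) and fill the
 * digits with the pattern. In CPython, the fill would presumably be
 * guarded by #ifdef Py_DEBUG; it is unconditional here to keep the
 * sketch short. */
static sketch_long *
sketch_long_alloc(size_t size)
{
    size_t ndigits = size ? size : 1;
    sketch_long *result = malloc(sizeof(sketch_long)
                                 + ndigits * sizeof(digit));
    if (result == NULL) {
        return NULL;
    }
    result->ndigits = ndigits;
    memset(result->ob_digit, SKETCH_FILL_BYTE, ndigits * sizeof(digit));
    return result;
}

/* Return 1 if every digit is in the valid range, 0 if at least one
 * digit is out of range (for example, still holds the fill pattern). */
static int
sketch_long_is_initialized(const sketch_long *v)
{
    for (size_t i = 0; i < v->ndigits; i++) {
        if (v->ob_digit[i] > SKETCH_MASK) {
            return 0;
        }
    }
    return 1;
}
```

In a debug build, a check like sketch_long_is_initialized() could run inside an assertion at the point where a freshly built integer is normalized or returned, so that a missing digit store fails right away instead of silently producing a wrong value.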
