-*-org-*-
See https://bugzilla.redhat.com/show_bug.cgi?id=105371 for details.
Alternatively, use debuginfo to generate configure file.
They don’t contain return type info, which can change the parameter passing convention. We could use it and hope for the best. Also they don’t include the potentially present hidden this pointer.
A technique used in GDB (and in uprobes, I believe), whereby the instruction under breakpoint is moved somewhere else, and followed by a jump back to original place. When the breakpoint hits, the IP is moved to the displaced instruction, and the process is continued. We avoid all the fuss with singlestepping and reenablement.
For PLT hits, only exported prototypes would be considered. For symtab entry point hits, all would be.
This would be useful for replacing the arg1, emt2 etc.
The above format tweaks require that packs that expand to no types at all be supported. If this works, then it should be relatively painless to implement conditionals:
void ptrace(REQ=enum(PTRACE_TRACEME=0,…), |
if[REQ==0](pack(),pack(pid_t, void*, void *))) |
This is of course dangerously close to a programming language, and I think ltrace should be careful to stay as simple as possible. (We can hook into Lua, or TinyScheme, or some such if we want more general scripting capabilities. Implementing something ad-hoc is undesirable.) But the above can be nicely expressed by pattern matching:
void ptrace(REQ=enum[int](…)): |
[REQ==0] => () |
[REQ==1 or REQ==2] => (pid_t, void*) |
[true] => (pid_t, void*, void*); |
Or:
int open(string, FLAGS=flags[int](O_RDONLY=00,…,O_CREAT=0100,…)): |
[(FLAGS & 0100) != 0] => (flags[int](S_IRWXU,…)) |
This would still require pretty complete expression evaluation. Including pointer dereferences and such. And e.g. in accept, we need subtraction:
int accept(int, +struct(short, +array(hex(char), X-2))*, (X=uint)*); |
Perhaps we should hook to something after all.
This is closely related to above. Take the following syscall prototype:
long read(int,+string0,ulong); |
string0 means the same as string(array(char, zero(retval))*). But if read returns a negative value, that signifies errno. But zero takes this at face value and is suspicious:
read@SYS(3 <no return …> |
error: maximum array length seems negative |
, “\n\003\224\003\n”, 4096) = -11 |
Ideally we would do what strace does, e.g.:
read@SYS(3, 0x12345678, 4096) = -EAGAIN |
Some calls result in setting errno. Somehow mark those, and on failure, show errno. System calls return errno as a negative value (see the previous point).
This definitely calls for some general scripting. The goal is to have seconds in adjtimex calls show as e.g. 10s, 1m15s or some such.
Format should take value argument describing the value that should be analyzed. The following overwriting rules would then apply:
format | format(array(char, zero)*) |
format(LENS) | X=LENS, format[X] |
The latter expanded form would be canonical.
This depends on named arguments and parameter pack improvements (we need to be able to construct parameter packs that expand to nothing).
Combination of named arguments and some extensions could take care of that:
void func(X=hide(int*), long*, +pack(X)); |
This would show long* as input argument (i.e. the function could mangle it), and later show the pre-fetched X. The “pack” syntax is utterly undeveloped as of now. The general idea is to produce arguments that expand to some mix of types and values. But maybe all we need is something like
void func(out int*, long*); |
ltrace would know that out/inout/in arguments are given in the right order, but left pass should display in and inout arguments only, and right pass then out and inout. + would be backward-compatible syntactic sugar, expanded like so:
void func(int*, int*, +long*, long*); |
void func(in int*, in int*, out long*, out long*); |
This is useful in particular for:
ulong mbsrtowcs(+wstring3_t, string*, ulong, addr); |
ulong wcsrtombs(+string3, wstring_t*, ulong, addr); |
Where we would like to render arg2 on the way in, and arg1 on the way out.
But sometimes we may want to see a different type on the way in and on the way out. E.g. in asprintf, what’s interesting on the way in is the address, but on the way out we want to see buffer contents. Does something like the following make sense?
void func(X=void*, long*, out string(X)); |
Necessary in particular for:
int mincore(void *addr, size_t length, unsigned char *vec);
… where we need the following prototype:
int mincore(addr, size_t, +array(hex(char), (arg2 + 4095) / 4096)*);
This would be useful for __cxa_throw, presumably also for longjmp (do we handle that at all?) and perhaps a handful of others.
enum-like syntax, except disjunction of several values is assumed.
We currently can’t define time_t on 32bit machines. That mean we can’t describe a range of time-related functions.
Also, don’t format it as characted by default, string lens can do it. Perhaps introduce byte and ubyte and leave ‘char’ as alias of one of those with string lens applied by default.
Really we should keep everything as {u,}int{8,16,32,64} internally, and have long, short and others be translated to one of those according to architecture rules. Maybe this could be achieved by a per-arch config file with typedefs such as:
typedef ulong = uint8_t; |
- ARM and AARCH64 both support half-precision floating point
- there are two different half-precision formats, IEEE 754-2008 and “alternative”. Both have 10 bits of mantissa and 5 bits of exponent, and differ only in how exponent==0x1F is handled. In IEEE format, we get NaN’s and infinities; in alternative format, this encodes normalized value -1S × 2¹⁶ × (1.mant)
- The Floating-Point Control Register, FPCR, controls: — The half-precision format where applicable, FPCR.AHP bit.
- AARCH64 supports fixed-point interpretation of {,double}words
- e.g. fixed(int, X) (int interpreted as a decimal number with X binary digits of fraction).
- AARCH64 supports 128-bit quad words in SIMD
Or even marked __attribute__((pure)).
GDB supports python pretty printers. We migh want to hook this in and use it to format certain types.
- PTRACE_SIEZE
- /proc/PID/map_files/* (but only root seems to be able to read this as of now)
>> I looked into this. Ltrace is definitely leaking it. The regex is >> released when filter_destroy() calls filter_rule_destroy(), but those >> are not called by anything. > >Ah, there we go. I just saw that we call regfree, but didn’t check >whether we actually call those. Will you roll this into your change >set, or should I look into it?
I’d rather you looked at it, if you don’t mind.
Basically we’d like to follow pthread_create always, and fork only if -f is given. ltrace now follows nothing, unless -f is given, and then it follows everything. (Really it follows everything alway and detaches from newly-created children unless -f is given.)
The thing is, in Linux, we can choose to follow only {v,}forks by setting PTRACE_O_TRACE{V,}FORK. We can also choose to follow all clones by setting PTRACE_O_TRACECLONE, but that captures pthread_create as well as fork (as all these are built on top of the underlying clone system call), as well as direct clone calls.
So what would make sense would be to tweak the current logic to only detach if what happened was an actual fork, which we can tell from the parameters of the system call. That might provide a more useful user experience. Tracing only a single thread is problematic anyway, because all the threads will hit the breakpoints that ltrace sets anyway, so pre-emptively tracing all threads is what you generally need.