Plast interview

39. Syscall Hot Path

In a low-latency feed handler, even a single blocking syscall can dominate tail latency. You often must diagnose behavior in production without changing the binary. Consider this minimal socket wrapper on the hot path.

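#include <sys/socket.h>  // recv
#include <unistd.h>      // close

// Move-only owner of a file descriptor; closes it on destruction.
struct Fd {
  int fd;
  explicit Fd(int f) noexcept : fd(f) {}  // take ownership of an open descriptor
  ~Fd() { if (fd >= 0) ::close(fd); }
  Fd(const Fd&) = delete;
  Fd(Fd&& o) noexcept : fd(o.fd) { o.fd = -1; }
};
// Thin hot-path wrapper: returns recv's result unchanged (byte count, 0, or -1 with errno set).
ssize_t rd(Fd& s, char* b, size_t n) noexcept { return ::recv(s.fd, b, n, 0); }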

Part 1.

Without modifying the binary, how would you determine whether rd blocks or returns EAGAIN, and what minimal change makes it robust for non-blocking sockets?

Part 2.

(1) strace vs ltrace: key differences and selection criteria?

(2) Why are short reads common on sockets, and how should APIs signal partial data?

(3) What changes with O_NONBLOCK for recv, and how should callers handle EAGAIN?

(4) Does noexcept on rd affect inlining or codegen on hot paths?

(5) When should you prefer recvmmsg/readv over recv, and how would you verify the change with strace?

Answer

Answer (Part 1)

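Attach strace -tt -T -e trace=network -p <pid> and watch the recvfrom lines (on x86-64, glibc issues recv as the recvfrom syscall). -T prints the time spent inside each call: a long duration means the call blocked, while an immediate -1 EAGAIN (EWOULDBLOCK) means the socket is non-blocking and had no data ready. strace stops the process on every traced syscall via ptrace, so keep the attachment brief in production. To make the wrapper robust, set the socket non-blocking (fcntl(fd, F_SETFL, flags | O_NONBLOCK)) and report EAGAIN/EWOULDBLOCK as a distinct, non-fatal "would block" result that the caller retries after the next readiness event. Do not map it to 0 bytes: a 0 return from recv on a stream socket means the peer closed the connection.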
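
A minimal sketch of the non-blocking change, assuming Linux/POSIX sockets. RdStatus, RdResult, set_nonblocking, and rd_nb are hypothetical names, not part of the original wrapper. The sketch reports EAGAIN as its own status instead of 0 bytes so that callers can still tell "no data yet" apart from end of stream.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical result type: keeps "no data yet" distinct from EOF and from hard errors.
enum class RdStatus { Data, WouldBlock, Eof, Error };

struct RdResult {
  RdStatus status;
  std::size_t bytes;  // meaningful only when status == Data
  int err;            // errno value when status == Error, otherwise 0
};

// Switch an existing descriptor to non-blocking mode, preserving its other flags.
inline bool set_nonblocking(int fd) noexcept {
  int flags = ::fcntl(fd, F_GETFL, 0);
  if (flags < 0) return false;
  return ::fcntl(fd, F_SETFL, flags | O_NONBLOCK) == 0;
}

// Read once, retrying only on EINTR, and classify the outcome.
inline RdResult rd_nb(int fd, char* buf, std::size_t n) noexcept {
  assert(fd >= 0 && buf != nullptr && n > 0);  // n == 0 would look like EOF
  for (;;) {
    ssize_t r = ::recv(fd, buf, n, 0);
    if (r > 0) return {RdStatus::Data, static_cast<std::size_t>(r), 0};
    if (r == 0) return {RdStatus::Eof, 0, 0};  // stream socket: peer closed
    if (errno == EINTR) continue;              // interrupted by a signal, try again
    if (errno == EAGAIN || errno == EWOULDBLOCK)
      return {RdStatus::WouldBlock, 0, 0};     // nothing ready, retry after readiness
    return {RdStatus::Error, 0, errno};
  }
}
```

A caller would typically switch on status: parse on Data, return to the event loop on WouldBlock, and tear the connection down on Eof or Error.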

Answer (Part 2)

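(1) strace traces system calls at the user/kernel boundary: arguments, return values, errno, and (with -T) time spent in each call. ltrace intercepts dynamic library calls made through the PLT, so it does not see statically linked or inlined functions. Choose strace to learn what the kernel was asked to do and how long it took; choose ltrace to see which library functions the wrapper layer called and with what arguments.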

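(2) TCP is a byte stream with no message boundaries. recv returns whatever bytes are in the socket buffer at that moment, which depends on segmentation, arrival timing, and when the reader is scheduled. Return the byte count to the caller, treat 0 as end of stream, and let a framing layer (for example, a length prefix) decide when a message is complete.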
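
For (2), one way to keep partial reads out of the parsing logic is to accumulate bytes and hand out only complete messages. The sketch below assumes a hypothetical wire format (4-byte big-endian length followed by the payload); FrameBuffer and its methods are illustrative names, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical framing: each message is a 4-byte big-endian length, then the payload.
class FrameBuffer {
 public:
  // Append whatever recv returned, however few bytes that was.
  void append(const char* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    buf_.append(data, n);
  }

  // Move the next complete payload into out and return true,
  // or return false if more bytes are needed.
  bool next(std::string& out) {
    const std::size_t kHeader = 4;
    if (buf_.size() < kHeader) return false;  // header itself is incomplete
    std::uint32_t len = 0;
    for (std::size_t i = 0; i < kHeader; ++i)
      len = (len << 8) | static_cast<unsigned char>(buf_[i]);
    if (buf_.size() - kHeader < len) return false;  // payload is incomplete
    out = buf_.substr(kHeader, len);
    buf_.erase(0, kHeader + len);  // simple but O(n); a hot path would track an offset
    return true;
  }

 private:
  std::string buf_;
};
```

The read loop appends after every successful recv and then calls next until it returns false. A production version would also cap the accepted length so a corrupt header cannot force unbounded buffering.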

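(3) With O_NONBLOCK, recv returns -1 with errno set to EAGAIN (or EWOULDBLOCK) when no data is available, instead of blocking. Callers should treat that as "not ready" rather than as a failure: wait on a readiness API such as epoll and retry when the socket is readable. With edge-triggered epoll, keep reading until EAGAIN so no buffered data is left behind.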
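
For (3), the readiness-and-retry pattern can be sketched with edge-triggered epoll on Linux. watch_readable, drain, and wait_and_drain are hypothetical helper names; the sketch assumes every registered descriptor is already non-blocking and simply discards the bytes it reads.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>

// Register fd for edge-triggered read readiness. fd must already be non-blocking.
inline bool watch_readable(int epfd, int fd) noexcept {
  assert(epfd >= 0 && fd >= 0);
  epoll_event ev{};
  ev.events = EPOLLIN | EPOLLET;
  ev.data.fd = fd;
  return ::epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0;
}

// Read until EAGAIN. Returns bytes consumed, or -1 on EOF or a hard error.
inline long drain(int fd) noexcept {
  char buf[4096];
  long total = 0;
  for (;;) {
    ssize_t r = ::recv(fd, buf, sizeof buf, 0);
    if (r > 0) { total += r; continue; }  // a real handler would parse buf here
    if (r == 0) return -1;                // peer closed the connection
    if (errno == EINTR) continue;
    if (errno == EAGAIN || errno == EWOULDBLOCK) return total;  // drained
    return -1;
  }
}

// Wait once, then drain every ready socket. Returns total bytes read,
// 0 on timeout or EINTR, or -1 if epoll_wait itself fails.
inline long wait_and_drain(int epfd, int timeout_ms) noexcept {
  epoll_event events[64];
  int n = ::epoll_wait(epfd, events, 64, timeout_ms);
  if (n < 0) return errno == EINTR ? 0 : -1;
  long total = 0;
  for (int i = 0; i < n; ++i) {
    int fd = events[i].data.fd;
    long got = drain(fd);
    if (got < 0) {
      ::epoll_ctl(epfd, EPOLL_CTL_DEL, fd, nullptr);  // stop watching a dead socket
      continue;
    }
    total += got;
  }
  return total;
}
```

Edge-triggered mode reports a socket only when new data arrives, which is why drain loops until EAGAIN. With level-triggered mode (no EPOLLET) a single recv per event would also work, at the cost of more wakeups.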

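(4) noexcept promises that no exception escapes rd; if one did, std::terminate would be called. Callers can omit exception-handling paths around the call, which may slightly reduce unwind tables and cleanup code. It is not an inlining directive, and for a one-line wrapper around a C function the effect on generated code is usually small. It does not affect errno or syscall semantics.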
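
For (4), what the specifier tells callers can be checked at compile time with the noexcept operator. The sketch below uses two hypothetical variants of the wrapper (rd_plain and rd_nothrow) that take a raw descriptor. It only demonstrates the compile-time property; any codegen difference has to be confirmed by inspecting the compiler's output for the actual build.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical variants of the wrapper, identical except for the specifier.
ssize_t rd_plain(int fd, char* b, std::size_t n) { return ::recv(fd, b, n, 0); }
ssize_t rd_nothrow(int fd, char* b, std::size_t n) noexcept { return ::recv(fd, b, n, 0); }

// The noexcept operator reports what the declaration promises; operands are not evaluated.
static_assert(!noexcept(rd_plain(0, nullptr, 0)), "rd_plain may throw as far as callers know");
static_assert(noexcept(rd_nothrow(0, nullptr, 0)), "rd_nothrow promises not to throw");
```

Both functions behave identically at run time, including how errno is set; only the information available to the compiler at call sites differs.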

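(5) recvmmsg (Linux-specific) receives several datagrams in one syscall, which amortizes syscall overhead on busy UDP sockets. readv fills several buffers from a single read (for example, header and payload separately), which saves a copy or a second call but does not batch messages. Prefer them when profiling shows many small recv calls. Verify with strace: recvmmsg lines should replace runs of recvfrom, each returning more than one message, and strace -c should show fewer calls in total.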
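
For (5), a minimal sketch of batched datagram receive with recvmmsg, assuming Linux and a datagram socket. Batch, recv_batch, kBatch, and kMaxDgram are hypothetical names and sizes chosen for illustration.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // recvmmsg is a Linux/glibc extension
#endif
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstring>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

constexpr unsigned kBatch = 8;           // hypothetical: datagrams per syscall
constexpr std::size_t kMaxDgram = 2048;  // hypothetical: buffer size per datagram

struct Batch {
  char bufs[kBatch][kMaxDgram];
  unsigned lens[kBatch];  // bytes received in each filled slot
  int count;              // number of filled slots
};

// Receive up to kBatch datagrams in one syscall without blocking.
// Returns the number received, 0 if none were ready, or -1 on error.
inline int recv_batch(int fd, Batch& out) noexcept {
  assert(fd >= 0);
  mmsghdr msgs[kBatch];
  iovec iovs[kBatch];
  std::memset(msgs, 0, sizeof msgs);
  for (unsigned i = 0; i < kBatch; ++i) {
    iovs[i].iov_base = out.bufs[i];
    iovs[i].iov_len = kMaxDgram;
    msgs[i].msg_hdr.msg_iov = &iovs[i];  // one buffer per datagram
    msgs[i].msg_hdr.msg_iovlen = 1;
  }
  int n = ::recvmmsg(fd, msgs, kBatch, MSG_DONTWAIT, nullptr);
  if (n < 0) {
    out.count = 0;
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;
  }
  for (int i = 0; i < n; ++i) out.lens[i] = msgs[i].msg_len;
  out.count = n;
  return n;
}
```

Under strace, a call that picked up three queued datagrams appears as a single recvmmsg line returning 3, where the recv-based version would show three recvfrom lines.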