Sponsor 38 (shadow simulator) update

I've just shared the 2022-04 update in the Shadow forum: https://github.com/shadow/shadow/discussions/2007

Also it looks like I forgot to share the previous 2021-12 update here: https://github.com/shadow/shadow/discussions/1824

Both mirrored below:

2022-04 update

This is part of a series of periodic updates of development in Shadow. This work is sponsored by the NSF <https://github.com/shadow/shadow/blob/main/docs/nsf_sponsorship.md>.

Previous update: 2021-12 <https://github.com/shadow/shadow/discussions/1824>.

We've merged 82 non-dependabot pull requests <https://github.com/shadow/shadow/pulls?q=is%3Apr+merged%3A2021-12-17..2022-04-05+-author%3Aapp%2Fdependabot> and closed 18 issues <https://github.com/shadow/shadow/issues?q=closed%3A2021-12-17..2022-04-05+is%3Aissue> since our previous update.

Release status

We are continuing to work on Shadow 2.1 <https://github.com/shadow/shadow/projects/5>. The biggest user-facing goal for this release is to support running golang programs in Shadow <https://github.com/shadow/shadow/milestone/30>, especially tor simulations using the snowflake pluggable transport <https://github.com/shadow/shadow/issues/1549>.

We've also begun planning the Shadow 2.2 <https://github.com/shadow/shadow/projects/6> release, which will largely be a push to refactor and migrate more of the core Shadow code to Rust.

Notable change since last update

Emulation accuracy

* Optionally move time forward in non-blocking syscalls <https://github.com/shadow/shadow/pull/1995>. Historically Shadow doesn't move time forward except when explicitly waiting for an event, such as for a deadline to pass or for data to arrive over the network. Conceptually, this emulates a system with an infinite number of infinitely fast CPUs. Normally this is sufficient for modeling networks where CPUs aren't expected to be a bottleneck.
  Unfortunately, as we've explored running more software under Shadow, we've found a growing number of examples of code with "busy loops" <https://github.com/shadow/shadow/issues/1792>, which deadlock in this model. This new feature optionally models every syscall taking some small amount of time (e.g. a microsecond), which allows the simulation to escape most such loops. We are still testing and improving this feature, and expect some version of it to be enabled by default in the next release.
* Fixed a corner case in getaddrinfo <https://github.com/shadow/shadow/pull/1998>.
* Fixed several bugs in handling file descriptors:
    o Fixed pipe writable state when reader is closed <https://github.com/shadow/shadow/pull/1985>.
    o Have pipe writers return EPIPE if there are no readers <https://github.com/shadow/shadow/pull/1983>.
    o Fixed errno for socket with SOCK_SEQPACKET type <https://github.com/shadow/shadow/pull/1972>.
    o Don't fail on TCP_NODELAY <https://github.com/shadow/shadow/pull/1936>.
* Reliably intercept time via vdso <https://github.com/shadow/shadow/pull/1951>. Previously we relied on calls to VDSO functions (such as |gettimeofday|) being intercepted at the libc level via |LD_PRELOAD|. However, this doesn't work when the VDSO is used more directly, such as in golang, which would cause the program to get the real-world time instead of the simulated time. We now patch the VDSO itself at program start to reliably intercept these functions in such cases.
* Implemented basic signal emulation <https://github.com/shadow/shadow/pull/1881>. This allows managed code to install signal handlers and send and receive signals within the simulation. (Sending a signal from outside the simulation to a managed process is still not supported.) Notably this support is required to handle golang programs, and allows simulated processes to be shut down cleanly by scheduling |kill| processes to send appropriate signals.
* Implemented the |select| system call <https://github.com/shadow/shadow/pull/1910>.

Usability

* Added an experimental option for strace-style logging <https://github.com/shadow/shadow/pull/1886> of syscalls made by managed code.
* Stabilized the |--progress| <https://github.com/shadow/shadow/pull/1937> option, which periodically updates stderr with the simulation progress.

Performance

* Nightly shadow benchmarks are now being run to help detect performance regressions. We also have the ability to run the benchmark on our own development branches to investigate performance changes before merging these branches into Shadow. Benchmark results for a 5% Tor network are published publicly <https://github.com/shadow/benchmark-results/tree/master/tor>, but are only intended to be useful for Shadow developers.
* Added a library for overriding crypto functions <https://github.com/shadow/shadow/pull/1853>. When enabled, this option overrides some openssl APIs with "no-op" implementations. This is a reimplementation of a feature previously available in Shadow's "tor plugin". It is intended primarily to improve the performance of tor simulations on hardware without accelerated AES operations, and is only supported on Debian 11.

Happy simulating!
The Shadow team

2021-12 update

This is part of a series of periodic updates of development in Shadow. This work is sponsored by the NSF <https://github.com/shadow/shadow/blob/main/docs/nsf_sponsorship.md>.

Previous update: 2021-10 <https://github.com/shadow/shadow/discussions/1695>.

We've merged 78 pull requests <https://github.com/shadow/shadow/pulls?q=is%3Apr+merged%3A%3E%3D2021-09-30> and closed 17 issues <https://github.com/shadow/shadow/issues?q=closed%3A%3E%3D2021-09-30+is%3Aissue+> since our previous update.

Release status

We have released Shadow 2.0 <https://github.com/shadow/shadow/releases/tag/v2.0.0>! Give it a try, and let us know if you run into any issues!

We have begun work on Shadow 2.1 <https://github.com/shadow/shadow/projects/5>. Notable planned features are support for signals (which is needed <https://github.com/shadow/shadow/issues/1549> to reliably run golang programs under Shadow), and improved Unix socket support.

Notable change since last update

Emulation accuracy

* Add unnamed unix sockets created with socketpair() <https://github.com/shadow/shadow/pull/1787>
* Add support for the 'O_DIRECT' flag to pipes (packet mode) <https://github.com/shadow/shadow/pull/1746>
* poll and epoll: wait for timeout even with no fd's <https://github.com/shadow/shadow/pull/1697>

Performance improvements

* Set default min runahead to 1ms <https://github.com/shadow/shadow/pull/1711>. This fixes a performance regression <https://github.com/shadow/shadow/issues/1699> in many-host simulations using multiple CPUs.

Stability

* Detect and handle death by signals <https://github.com/shadow/shadow/pull/1722>
* Handle process exit while it has events scheduled <https://github.com/shadow/shadow/pull/1704>

UI improvements

* Move host heartbeat options to the experimental options, and disable by default <https://github.com/shadow/shadow/pull/1703>

Shadow at Tor

We have been collaborating with The Tor Project to use Shadow in its development and testing. We have been running Shadow simulations inside a Gitlab CI pipeline <https://gitlab.torproject.org/jnewsome/sponsor-61-sims> to help develop and tune improved congestion control algorithms in the upcoming 0.4.7 <https://lists.torproject.org/pipermail/tor-project/2021-December/003241.html> release. This is the first major application of Shadow inside the Tor Project itself, and we plan to use the pipeline we've developed for further testing and profiling.

We have also been making progress on running Arti (the experimental new Rust implementation of tor) under Shadow: https://gitlab.torproject.org/tpo/core/arti/-/issues/174.
In the course of this work we have fixed some bugs in Arti, fixed an upstream bug in the async-io <https://github.com/smol-rs/async-io/pull/73> crate, and identified and fixed several emulation accuracy bugs in Shadow.

Happy simulating!
The Shadow team
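
P.S. For readers who want a concrete picture of the "busy loop" problem described in the 2022-04 update (the item about moving time forward in non-blocking syscalls), here is a toy sketch. It is not Shadow's implementation: the ToyClock class, the busy_wait helper, and the iteration cap are all hypothetical names invented for illustration. It only models the idea that a loop polling the clock never exits if syscalls take zero simulated time, and does exit if each syscall is charged a small cost such as one microsecond.

```python
"""Toy model of simulated time under a busy loop (not Shadow's implementation)."""

MICROSECOND_NS = 1_000


class ToyClock:
    """Hypothetical simulated clock that charges a fixed cost per syscall."""

    def __init__(self, syscall_cost_ns=0):
        self.now_ns = 0
        self.syscall_cost_ns = syscall_cost_ns

    def nonblocking_syscall(self):
        # Stand-in for any non-blocking syscall, e.g. reading the current time.
        # With a cost of 0 this matches the historical "infinitely fast CPU" model.
        self.now_ns += self.syscall_cost_ns
        return self.now_ns


def busy_wait(clock, deadline_ns, max_iterations=100_000):
    """Spin until the simulated clock reaches deadline_ns.

    Returns the number of iterations used, or None if the deadline was
    never observed (a real busy loop would spin forever in that case;
    the cap exists only so this sketch terminates).
    """
    for iteration in range(1, max_iterations + 1):
        if clock.nonblocking_syscall() >= deadline_ns:
            return iteration
    return None


# Historical model: time never advances inside the loop, so it never exits.
stuck = busy_wait(ToyClock(syscall_cost_ns=0), deadline_ns=50 * MICROSECOND_NS)

# Optional new model: every syscall costs 1 microsecond, so the loop escapes.
escaped = busy_wait(ToyClock(syscall_cost_ns=MICROSECOND_NS),
                    deadline_ns=50 * MICROSECOND_NS)

print(stuck, escaped)
```

In this sketch the zero-cost clock hits the iteration cap and reports None, while the one-microsecond clock reaches the 50 microsecond deadline after 50 calls. The real feature works on actual intercepted syscalls; see the pull request linked in the update for how Shadow does it.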
participants (1)
-
Jim Newsome