feat(guest): replace musl with picolibc#831
feat(guest): replace musl with picolibc#831andreiltd wants to merge 10 commits intohyperlight-dev:mainfrom
Conversation
617d834 to
3f8ffcf
Compare
eaf1d54 to
b18a374
Compare
|
will help with #282 |
499c59d to
2ae0a3a
Compare
2ae0a3a to
d3826af
Compare
d2043ac to
2bd1ff4
Compare
src/hyperlight_guest_bin/c/clock.c
Outdated
There was a problem hiding this comment.
did we write this file? is there attribution that is needed?
There was a problem hiding this comment.
If this is really necessary, I think @jprendes just wrote a Rust implementation for this in hyperlight-js. maybe we could use it?
There was a problem hiding this comment.
Yes, those are ours. We cannot use the implementation from hyperlight-js directly, it serves a bit different purpose. That being said I rewrote picolibc stubs to rust. The tradeoff is that some of the types and defines that you normally get from libc headers have to be redifined in Rust with compatible ABI. I think that is OK -- they have to be backwards compatible and I don't see them change anytime soon.
| Some(vec![ParameterValue::ULong(count as u64)]), | ||
| ReturnType::VecBytes, | ||
| ) { | ||
| Ok(bytes) => { | ||
| let n = bytes.len(); | ||
| unsafe { | ||
| core::ptr::copy_nonoverlapping(bytes.as_ptr(), buf as *mut u8, n); |
There was a problem hiding this comment.
should not rely on the host provided length. nonoverlapping could write past the buffer lenghth. We are in the guest here so it would just cause corruption but might be tough to track down I think
| obstacle to adoption, that text has been removed. | ||
| Note: The picolibc submodule uses sparse checkout to exclude | ||
| GPL/AGPL-licensed test and script files that are not needed for | ||
| building. Only BSD/MIT/permissive-licensed source files are included. |
There was a problem hiding this comment.
Just wanted to surface this. We should include it in the PR description I think
There was a problem hiding this comment.
Where is the sparsity pattern that makes this actually happen?
There was a problem hiding this comment.
dblnz
left a comment
There was a problem hiding this comment.
Great work, Tomasz!! This is one hell of a diff and I like it 😆
src/hyperlight_guest_bin/c/clock.c
Outdated
There was a problem hiding this comment.
If this is really necessary, I think @jprendes just wrote a Rust implementation for this in hyperlight-js. maybe we could use it?
src/hyperlight_guest_bin/c/clock.c
Outdated
|
|
||
| extern void _current_time(uint64_t *ts); | ||
|
|
||
| int gettimeofday(struct timeval *__restrict tv, void *__restrict __tz) { |
There was a problem hiding this comment.
Thought: If we were running bindgen on picolib headers, then we could easily implement this one in rust instead of C. IIUC, the main issue is the definition of timeval, clockid_t, and timespec in Rust, that we don't want to get wrong.
There was a problem hiding this comment.
This file is no longer in the diff.
There was a problem hiding this comment.
If we were running bindgen on picolib headers, then we could easily implement this one in rust
That would be correct yes, but I would rather much have a single c file defining the interface or redefine types in Rust with correct layout. It's hard to justify running the whole bindgen machinery for few u64.
|
Is the licensing approval finally done for this, making it ready for review/merge? |
|
|
||
| #define GUEST_SCRATCH_SIZE (0x40000) // default scratch size | ||
| #define MAX_BUFFER_SIZE (1024) | ||
|
|
||
| #define printf_f(fmt, ...) \ |
There was a problem hiding this comment.
Does it make sense to see if we can either build picolibc in a configuration without stdio buffers, or to automatically do a flush on the guest-function-call-complete path?
I feel like the buffered-I/O-by-default regime will be very confusing to users---where, for example, if you are doing the "call a function and then restore the sandbox" pattern might often not see any output at all.
There was a problem hiding this comment.
Yeah, I think we can disable posix console feature: https://github.com/picolibc/picolibc/blob/main/doc/build.md#stdio-options
Fuzzy on the details but IIRC that will change to char by char sending to host which I thought was insane.
There was a problem hiding this comment.
Yeah, that makes sense... char-by-char is definitely unfortunate. Flush-on-guest-function-return would be nice if we could coalesce the flush exit and the function-return exit, but I'm not sure how easy that is?
There was a problem hiding this comment.
As Jorge suggested: #831 (comment) we can perhaps add flushing in next PR?
|
|
||
| #[unsafe(no_mangle)] | ||
| pub extern "C" fn _current_time(ts: *mut u64) -> c_int { | ||
| let bytes = call_host_function::<Vec<u8>>("CurrentTime", Some(vec![]), ReturnType::VecBytes) |
There was a problem hiding this comment.
Similarly, I think I might like to see this fail at least by default? As with my comments on #1173, I'm against exposing a clock, especially a wall clock, to the guest by default.
There was a problem hiding this comment.
This is very common dependency, for example spidermonkey crashes immediately if we don't provide some time interface. Do we really want to error or at least return an unix epoch?
There was a problem hiding this comment.
Interesting... I didn't realise so many things were using wall-clock time. Do you know what we were doing for spidermonkey before? I don't think we did actually have the real time?
I wouldn't be against having it always be the epoch personally (at least by default), but I'll let others chime in here as well.
There was a problem hiding this comment.
at some point we had a fallback where we took some epoch (2000?) and added 1s or 1ms for each call
| static CURRENT_TIME: AtomicU64 = AtomicU64::new(0); | ||
|
|
||
| #[unsafe(no_mangle)] | ||
| pub extern "C" fn read(fd: c_int, buf: *mut c_void, count: usize) -> isize { |
There was a problem hiding this comment.
Where are the fds referenced in these functions allocated? I don't see an open, so are these always either 0/1/2 for stdin/out/err? I think it might make sense to make the behaviour a bit more limited (maybe reads from fd0 always return eof; writes to fd1 are HostPrint, and writes to fd2 are debug_print, or something like that?)
There was a problem hiding this comment.
That's almost what you have here, right? except that:
- fd2 also goes to
HostPrintinstead ofdebug_print(I didn't knowdebug_printwas a thing!). - fd0 first tries
HostRead, if that's not present returns-1and setserrno = EIO(instead of EOF, which would be returning0and noerrno) - any other fd is
EBADF
So, do you suggest these changes?
- fd2 to use
debug_printinstead ofHostPrint - fd0 to return
0(and untouchederrno) in case of noHostRead
There was a problem hiding this comment.
I don't care much about the fd2---it was just the first thing that sprang to mind.
Unless we have a compelling use case, I think I would prefer for read from fd0 to unconditionally be eof, without probing for a host function which seems a bit surprising and magical. If we ever have C code that needs to be able to read() from stdin, we could add some hyperlight_guest_bin API to let a little Rust wrapper code install its own read function.
There was a problem hiding this comment.
I'm happy to go with that logic. @jprendes any objections?
There was a problem hiding this comment.
you can call fflush(NULL) and that flushes all open FDs.
Can we leave buffering and add that on VM exit (or similar) in an upcoming PR?
There was a problem hiding this comment.
also I think our current _putchar impl can go away, right?
There was a problem hiding this comment.
I got rid of function probing and replaced current_time with unix epoch increasing 1s each call -- this should be enough to get some sort of ordering working. The read implementation now return eof immediately. The behavior is documented in docs/picolibc.md
As Lucy said if we need to give user control over this we can expose hooks in the guest-bin crate.
cacc739 to
2fbc721
Compare
dblnz
left a comment
There was a problem hiding this comment.
Great work, Tomasz!
I have nothing else to add.
Signed-off-by: Tomasz Andrzejak <andreiltd@gmail.com>
Signed-off-by: Tomasz Andrzejak <andreiltd@gmail.com>
Signed-off-by: Tomasz Andrzejak <andreiltd@gmail.com>
Signed-off-by: Tomasz Andrzejak <andreiltd@gmail.com>
Signed-off-by: Tomasz Andrzejak <andreiltd@gmail.com>
Signed-off-by: Tomasz Andrzejak <andreiltd@gmail.com>
Signed-off-by: Tomasz Andrzejak <andreiltd@gmail.com>
3680b58 to
20195de
Compare
Signed-off-by: Tomasz Andrzejak <andreiltd@gmail.com>
cargo install already skips the installation if the tool is the newest version and updates otherwise. Signed-off-by: Tomasz Andrzejak <andreiltd@gmail.com>
01f8cb1 to
4daa971
Compare
ludfjig
left a comment
There was a problem hiding this comment.
I think this is awesome!
|
|
||
| if cfg!(windows) { | ||
| unsafe { env::set_var("AR_x86_64_unknown_none", "llvm-ar") }; | ||
| // TODO: this is wrong, it should generate config file to out_dir/include |
There was a problem hiding this comment.
Do we have a plan to fix this?
There was a problem hiding this comment.
I'm not sure how we fix this. Polluting source tree with generated files is obviously bad but we need a stable location for c guests that have no access to hashed rust target dir.
One way of improving this would be to use the existing HYPERLIGHT_GUEST_TOOLCHAIN_ROOT mechanism for c guests. We could copy generated headers to HYPERLIGHT_GUEST_TOOLCHAIN_ROOT/include and set the env when building c guests. Then we can prepend the headers from that dir to clang calls.
I believe a proper fix would be to replace just script and cargo with polyglot build system.
| Command::new("git") | ||
| .args([ | ||
| "-C", | ||
| "third_party/picolibc", | ||
| "sparse-checkout", | ||
| "init", | ||
| "--no-cone", | ||
| ]) | ||
| .status()?; | ||
|
|
||
| Command::new("git") | ||
| .args([ | ||
| "-C", | ||
| "third_party/picolibc", | ||
| "sparse-checkout", | ||
| "set", | ||
| "**", | ||
| "!test/**", | ||
| "!scripts/**", | ||
| "!COPYING.GPL2", | ||
| ]) | ||
| .status()?; |
There was a problem hiding this comment.
I think we should check .success() on these commands too
| } | ||
|
|
||
| #[unsafe(no_mangle)] | ||
| pub extern "C" fn _current_time(ts: *mut u64) -> c_int { |
There was a problem hiding this comment.
does this need a null check too given below unsafe?
| let has = if detect_cc_feature(code)? { 1 } else { 0 }; | ||
| writeln!(file, "#define __HAVE_COMPLEX {has}")?; | ||
|
|
||
| // Static undefs |
There was a problem hiding this comment.
How did you decide whether to enable/disable some of these? I'm thinking mostly about
__IO_C99_FORMATS disabled, __IO_WCHAR enabled, __OBSOLETE_MATH disabled, __IO_LONG_LONG disabled
There was a problem hiding this comment.
And if a particular guest wants different picolibc settings, is that possible? I can imagine that some settings are specific to hyperlight itself, and are not allowed to be changed by guest, but maybe certain guests need more control?
There was a problem hiding this comment.
picolibc is using meson as build system -- most of those defines are default settings that meson generates. The defines are gathered from generated by meson picolibc.h file and compile_command.json file. You can have a look at all the switches that we enable/disable for picolibc here:
https://github.com/andreiltd/picoforge/blob/main/picolibc.just
and the docs for all the options:
https://github.com/picolibc/picolibc/blob/main/doc/build.md
There was a problem hiding this comment.
I understand, I was more curious if there some specific reason for these enables/disables since I think some are not default and since it's not immediately clear to me what the correct option should be. In addition, would it be valuable to allow guests to override these for example via environment variables?
There was a problem hiding this comment.
The non-defaults serve the hyperlight guest model: single-threaded, no TLS, no semihosting, no libc malloc since we provide rust allocator, verbose asserts for better debuggability, and some I/O performance tweaks discussed above like io buffering.
There are some knobs like wchar support that were needed to allow building baremetal libc++ against picolibc.
We can expose build settings but I would rather do that trough cargo features not env variables. That being said I'm not sure if we should proactively expose all the build tweaks or wait and see if any of this turns out to be problematic?
This patch removes custom musl implementation used by the guest and replace it with picolibc v1.8.11.
Note: The picolibc submodule uses sparse checkout to exclude GPL/AGPL-licensed test and script files that are not needed for building. Only BSD/MIT/permissive-licensed source files are included.