Logs: liberachat/#haskell
| 2021-08-02 08:30:40 | <merijn> | kuribas: Storable lets you just do arbitrary pointer offsets |
| 2021-08-02 08:31:07 | <kuribas> | merijn: but the whole idea of my construction is to **not** calculate pointer offsets. |
| 2021-08-02 08:31:12 | <kuribas> | Well, not manually. |
| 2021-08-02 08:31:19 | <merijn> | kuribas: Sucks to be you then |
| 2021-08-02 08:31:27 | <kuribas> | heh, what? |
| 2021-08-02 08:31:30 | <merijn> | You can try c2hsc or hsc2hs |
| 2021-08-02 08:31:36 | <merijn> | And pray to god it happens to work |
| 2021-08-02 08:32:08 | <merijn> | kuribas: You cannot automatically figure out field offsets |
| 2021-08-02 08:32:22 | <kuribas> | merijn: you can? It's just summing the offsets. |
| 2021-08-02 08:32:36 | <merijn> | It's not possible to do correctly. There's a bunch of "best effort" attempts like c2hsc, etc |
| 2021-08-02 08:32:40 | <kuribas> | a TH function can do that. |
| 2021-08-02 08:32:44 | <merijn> | kuribas: And who defines those offsets? |
| 2021-08-02 08:32:47 | <kuribas> | I do? |
| 2021-08-02 08:33:32 | <kuribas> | merijn: or are you trying to say that C offsets aren't well defined? |
| 2021-08-02 08:33:48 | <merijn> | kuribas: They aren't, because the sizes aren't well defined |
| 2021-08-02 08:34:21 | <kuribas> | merijn: the size of Int8 and Word16 seems pretty well defined to me ;-) |
| 2021-08-02 08:34:33 | <merijn> | kuribas: Well, no, because those aren't C types |
| 2021-08-02 08:34:49 | <kuribas> | sure, so I'll use a typedef with defined sizes? |
| 2021-08-02 08:35:05 | <sergio812> | If it's a C/C++ data structure, sizes aren't well defined. |
| 2021-08-02 08:35:07 | <merijn> | kuribas: And then you gotta account for padding |
| 2021-08-02 08:35:20 | <merijn> | kuribas: Compilers can (and do!) insert padding bytes into structures |
| 2021-08-02 08:35:24 | <dminuoso> | merijn: You have a bunch of things that transitively end up doing things like `sizeOf (undefined :: a)`, even though there's values of type `a` around. So you can't trivially handle things like VLAs |
| 2021-08-02 08:35:37 | <dminuoso> | at least not while using Foreign |
| 2021-08-02 08:35:42 | <kuribas> | merijn: so you cannot cast a ptr in C to a struct? |
| 2021-08-02 08:35:54 | <dminuoso> | And it's annoying because it's solvable. |
| 2021-08-02 08:36:00 | <merijn> | kuribas: The way c2hsc works is that it generates your struct as C code, generates code that prints out byte offsets for each field and then inserts those into the hsc code |
| 2021-08-02 08:36:13 | <merijn> | kuribas: Because that's the only way it will ever interoperate with C |
| 2021-08-02 08:36:21 | <merijn> | kuribas: Depends on what pointer |
| 2021-08-02 08:36:25 | <kuribas> | merijn: well, luckily I don't even need to go to C if I use this... |
| 2021-08-02 08:36:27 | <sergio812> | But if it's a binary format stored in files, the C/C++ implementation probably uses typedefs that guarantee(TM) you'll have consistent sizes |
| 2021-08-02 08:36:39 | <sergio812> | but they may not be known from Haskell land... |
| 2021-08-02 08:37:01 | <kuribas> | merijn: I was under the impression that people cast ptrs read from binary files to C structs... |
| 2021-08-02 08:37:08 | <merijn> | kuribas: Yeah, morons do |
| 2021-08-02 08:37:31 | <dminuoso> | 10:35:43 kuribas | merijn: so you cannot cast a ptr in C to a struct? <- depends. In simplified terms the C standard defines that you may not access an object through a pointer of an incompatible (i.e. different) type. |
| 2021-08-02 08:37:34 | <merijn> | kuribas: "casting ptrs read from binary files" makes no sense, tbh |
| 2021-08-02 08:38:13 | <kuribas> | merijn: of course you have to deal with byte ordering. But in my case I assume the byte ordering stays the same... |
| 2021-08-02 08:38:14 | <merijn> | You can have a ptr to bytes read from a file and you can (well, imagine some very big scare quotes around that "can") cast that pointer to something |
| 2021-08-02 08:38:32 | <merijn> | kuribas: Struct padding is different from ABI to ABI |
| 2021-08-02 08:38:36 | <kuribas> | merijn: as it's only for storage, not for distribution. |
| 2021-08-02 08:39:07 | <kuribas> | merijn: aren't there pragmas for dealing with padding etc... ? |
| 2021-08-02 08:39:16 | <merijn> | kuribas: The only people who dump out struct by "serialise bytes from a pointer" are people who want to hate themselves 3 years from now |
| 2021-08-02 08:39:41 | <merijn> | kuribas: You can add lots of things to make dumb things seem somewhat more reasonable :p |
| 2021-08-02 08:40:11 | <merijn> | kuribas: tbh, my first reaction would be to question your very initial assumption that you cannot afford "binary" because it copie |
| 2021-08-02 08:40:13 | <kuribas> | So in that case, my haskell solution will be faster than C :) |
| 2021-08-02 08:40:22 | <merijn> | Why do you think the overhead of copying will be at all relevant? |
| 2021-08-02 08:40:37 | <merijn> | i.e. how big is your data? how sparse is your accessing? how often do you load stuff? |
| 2021-08-02 08:41:09 | <kuribas> | merijn: it's chunked time series data. First I look up the timeseries range in the index, then the index points to another buffer with the actual data. |
| 2021-08-02 08:41:14 | × | reumeth quits (~reumeth@user/reumeth) (Ping timeout: 256 seconds) |
| 2021-08-02 08:41:59 | <merijn> | kuribas: I mean, binary has a "skip" combinator |
| 2021-08-02 08:42:22 | <merijn> | I don't really see how that indirection would really form a problem if you have an accurate byte offset |
| 2021-08-02 08:42:40 | <kuribas> | merijn: accurate byte offset is the problem... |
| 2021-08-02 08:42:43 | <merijn> | Your binary parser doesn't really have to parse your entire format |
| 2021-08-02 08:42:58 | <merijn> | kuribas: Well, you said you could manually define it |
| 2021-08-02 08:43:44 | <merijn> | tbh, the question is still ill-specified as it is unclear whether this is 1) a pre-existing, well-defined format or 2) a format you're defining and implementing yourself |
| 2021-08-02 08:43:54 | <kuribas> | merijn: the latter |
| 2021-08-02 08:44:16 | <merijn> | Then you can just define it as whatever is convenient |
| 2021-08-02 08:44:43 | <kuribas> | merijn: btw, I wrote a library for reading font formats using binary. |
| 2021-08-02 08:44:48 | <kuribas> | It isn't all that clean... |
| 2021-08-02 08:45:26 | <kuribas> | lots of manual byte fiddling... |
| 2021-08-02 08:45:40 | × | eggplantade quits (~Eggplanta@108-201-191-115.lightspeed.sntcca.sbcglobal.net) (Remote host closed the connection) |
| 2021-08-02 08:47:02 | × | econo quits (uid147250@user/econo) (Quit: Connection closed for inactivity) |
| 2021-08-02 08:47:56 | <kuribas> | merijn: in any case, binary and storable solve a different problem from the one I solved above, which is calculating offsets in a safer way. |
| 2021-08-02 08:48:10 | × | fossdd quits (~fossdd@sourcehut/user/fossdd) (Ping timeout: 240 seconds) |
| 2021-08-02 08:48:32 | <merijn> | If you are defining a format yourself you can define the format to make the offset easy and just do that? |
| 2021-08-02 08:48:40 | → | fossdd joins (~fossdd@sourcehut/user/fossdd) |
| 2021-08-02 08:48:41 | <merijn> | I don't understand the problem in that case |
| 2021-08-02 08:49:04 | <merijn> | "How do I calculate offsets?" 'well, I dunno man, that depends on how you defined your format...' |
| 2021-08-02 08:49:26 | <kuribas> | I mean, I know how to calculate offsets, but I'd prefer the compiler to do it for me... |
| 2021-08-02 08:50:12 | <merijn> | You have to define your format somewhere anyway, so just define the sizes there and then you can just compute it? |
| 2021-08-02 08:50:31 | <merijn> | I literally don't understand the problem you're trying to describe |
| 2021-08-02 08:50:55 | <merijn> | If you've defined the format, you already have the sizes defined somewhere, so you can just, like, sum them in a function and done? |
| 2021-08-02 08:51:18 | <kuribas> | "getField bs WriteLogHead" is more descriptive than "getFieldAt bs 16" |
| 2021-08-02 08:52:30 | <merijn> | So define a variable "writeLogHead" whose value is the sum of all the preceding fields? |
| 2021-08-02 08:53:03 | <kuribas> | or like, don't, and let the compiler do the sum? |
| 2021-08-02 08:53:15 | <kuribas> | why would I want to manually sum byte offsets? |
| 2021-08-02 08:53:59 | → | drd joins (~drd@93-39-151-19.ip76.fastwebnet.it) |
| 2021-08-02 08:54:08 | <merijn> | Well, if you wanna write TH code that loops over record fields, inspects their types, computes sizes from that, and sums them, I'm not stopping you |
| 2021-08-02 08:55:11 | <merijn> | I mean, it'll take you 3-5 days of hating yourself to figure out the TH code, it will compile slower, no one else on the team will understand it and it'll mess up your compile times *and* ability to cross-compile, but if that seems less effort than "writing out the offsets once in a file"...nobody's stopping you |
| 2021-08-02 08:56:00 | <mastarija> | So, I'm looking at this graph of typeclasses on typeclassopedia, https://wiki.haskell.org/File:Typeclassopedia-diagram.png |
| 2021-08-02 08:56:08 | <mastarija> | And it says "if there is an arrow from Foo to Bar it means that every Bar is (or should be, or can be made into) a Foo" |
| 2021-08-02 08:56:16 | <mastarija> | But that doesn't check out |
| 2021-08-02 08:56:22 | <merijn> | mastarija: How so? |
| 2021-08-02 08:56:25 | <mastarija> | Arrows should be in the opposite direction |
| 2021-08-02 08:56:33 | × | azeem quits (~azeem@dynamic-adsl-94-34-48-122.clienti.tiscali.it) (Ping timeout: 258 seconds) |
| 2021-08-02 08:56:36 | <mastarija> | Oh... |
| 2021-08-02 08:56:38 | <merijn> | mastarija: No |
| 2021-08-02 08:56:43 | <mastarija> | No, I read it wrong |
| 2021-08-02 08:56:47 | <merijn> | mastarija: :) |
| 2021-08-02 08:56:50 | <mastarija> | xD |
| 2021-08-02 08:56:58 | × | hnOsmium0001 quits (uid453710@id-453710.stonehaven.irccloud.com) (Quit: Connection closed for inactivity) |
| 2021-08-02 08:57:09 | <mastarija> | But I do think arrows should be reverse :D |
| 2021-08-02 08:57:13 | <merijn> | mastarija: Effectively the arrows are "this is a subset of" |
| 2021-08-02 08:57:33 | <mastarija> | Yes, but my intuition is "Superset" |
| 2021-08-02 08:57:33 | <merijn> | mastarija: I think there's something to be said for both arrow directions |
| 2021-08-02 08:57:54 | <merijn> | mastarija: In inheritance you usually point from the superclass to subclass too |
| 2021-08-02 08:58:10 | <merijn> | So it seems fairly natural to have arrows from superset to subset |
| 2021-08-02 08:58:20 | → | azeem joins (~azeem@176.200.220.247) |
| 2021-08-02 08:58:42 | <merijn> | It's also how hierarchies are normally represented |
| 2021-08-02 08:58:43 | <mastarija> | Yes.. I guess it kind of depends on the context you're coming from |
| 2021-08-02 09:00:44 | → | reumeth joins (~reumeth@user/reumeth) |
All times are in UTC.
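
Editor's sketch of the middle ground discussed above: sizes are written down once, offsets are summed by a plain function (no Template Haskell), and call sites use a field name instead of a literal like 16. Everything here is an assumption for illustration, not kuribas's actual construction: the `Field` constructors other than `WriteLogHead`, the sizes, the little-endian encoding, and the packed layout with no padding are all made up.

```haskell
import qualified Data.ByteString as BS
import Data.Bits (shiftL, (.|.))
import Data.Word (Word64)

-- Hypothetical header fields, listed in on-disk order.
data Field = Magic | Version | IndexStart | WriteLogHead | DataStart
  deriving (Show, Eq, Enum, Bounded)

-- Size of each field in bytes. These numbers are assumptions; in a
-- self-defined format the author picks them once, here.
fieldSize :: Field -> Int
fieldSize Magic        = 4
fieldSize Version      = 4
fieldSize IndexStart   = 8
fieldSize WriteLogHead = 8
fieldSize DataStart    = 8

-- Offset of a field = sum of the sizes of all fields before it.
-- Assumes a packed layout: no padding between fields.
offsetOf :: Field -> Int
offsetOf f = sum (map fieldSize (takeWhile (/= f) [minBound .. maxBound]))

-- Read a field as an unsigned little-endian integer.
-- Returns Nothing if the buffer is too short for the field.
getField :: BS.ByteString -> Field -> Maybe Word64
getField bs f
  | BS.length raw == n = Just (BS.foldr step 0 raw)
  | otherwise          = Nothing
  where
    n   = fieldSize f
    raw = BS.take n (BS.drop (offsetOf f) bs)
    -- foldr visits the last byte first, so the first byte ends up lowest.
    step byte acc = (acc `shiftL` 8) .|. fromIntegral byte

main :: IO ()
main = do
  -- A 32-byte header whose WriteLogHead field (bytes 16..23) holds 42.
  let header = BS.pack (replicate 16 0 ++ [0x2A, 0, 0, 0, 0, 0, 0, 0] ++ replicate 8 0)
  print (offsetOf WriteLogHead)        -- prints 16
  print (getField header WriteLogHead) -- prints Just 42
```

With these assumed sizes, `getField bs WriteLogHead` reads the same bytes as a hand-written `getFieldAt bs 16` would. The sum happens at run time rather than compile time, and adding or resizing a field only touches `fieldSize`. This only works because the format is self-defined and packed; it says nothing about C struct layout, where padding depends on the ABI.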