Logs: liberachat/#haskell
| 2025-10-27 19:08:14 | <monochrom> | Sounds like something my students tried. :) |
| 2025-10-27 19:08:44 | <EvanR> | but that's a separate issue from needing to reject stuff because it can't be decoded |
| 2025-10-27 19:08:49 | <haskellbridge> | <loonycyborg> it was easy to do under DOS |
| 2025-10-27 19:08:52 | <bwe> | loonycyborg: That's exactly the point. When you work with data not under your control, you get a mess. Like non-utf-8 stuff in utf-8 data. |
| 2025-10-27 19:09:01 | <haskellbridge> | <loonycyborg> because it doesn't even have +x attribute |
| 2025-10-27 19:09:15 | <haskellbridge> | <loonycyborg> and running a C file like that hung entire PC :P |
| 2025-10-27 19:09:16 | <dminuoso> | EvanR: Well, you can always do a lenient decode of course. |
| 2025-10-27 19:09:27 | <dminuoso> | I favour lenient UTF8 decodes in most of my software. |
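
A minimal sketch of what a lenient UTF-8 decode could look like with the `text` package, assuming strict `ByteString` input. The name `decodeLenient` is made up for illustration; `decodeUtf8With` and `lenientDecode` come from `Data.Text.Encoding` and `Data.Text.Encoding.Error`.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import Data.Text.Encoding.Error (lenientDecode)

-- | Hypothetical helper: decode bytes as UTF-8, substituting U+FFFD
-- (the replacement character) for anything that is not valid UTF-8.
-- It never throws, but the substitution is lossy: the original bytes
-- cannot be recovered from the resulting Text.
decodeLenient :: BS.ByteString -> T.Text
decodeLenient = TE.decodeUtf8With lenientDecode
```

Valid UTF-8 passes through unchanged; a stray byte such as `0xFF` becomes U+FFFD. Whether silently losing the bad bytes is acceptable depends on the application, which is the disagreement in the surrounding messages.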
| 2025-10-27 19:09:31 | <EvanR> | if you don't control the input, then there's no guarantee you can communicate, end of |
| 2025-10-27 19:09:37 | <haskellbridge> | <loonycyborg> That for sure sparked my interest in building software though |
| 2025-10-27 19:09:40 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 246 seconds) |
| 2025-10-27 19:10:03 | <EvanR> | the HTML policy of produce something just never fail ever |
| 2025-10-27 19:10:07 | <EvanR> | is insane |
| 2025-10-27 19:10:21 | <haskellbridge> | <loonycyborg> bwe: though in case of JSON in particular it's required to be UTF-8 by its own definition |
| 2025-10-27 19:10:42 | <EvanR> | tell that to the data he's getting |
| 2025-10-27 19:10:56 | <EvanR> | be utf-8 or else! *nothing happens* |
| 2025-10-27 19:11:05 | <monochrom> | I know that other people may disrespect your requirements. I am saying that eventually you have to draw a line and make a stand. Either that, or demand a higher price, or quit. |
| 2025-10-27 19:11:26 | <haskellbridge> | <loonycyborg> well whatever produces "JSON" in a non-standard encoding should probably be fixed too |
| 2025-10-27 19:11:48 | <EvanR> | citizen's arrest. Sorry, your program produces non-compliant json |
| 2025-10-27 19:12:02 | <EvanR> | that'll be $500 |
| 2025-10-27 19:12:49 | × | wbrawner quits (~wbrawner@static.56.224.132.142.clients.your-server.de) (Ping timeout: 255 seconds) |
| 2025-10-27 19:12:58 | → | wbrawner joins (~wbrawner@static.56.224.132.142.clients.your-server.de) |
| 2025-10-27 19:13:14 | <bwe> | EvanR: How do you deal with it if you still need to process the data? I mean, not having the liberty to just reject the data as being non-compliant. |
| 2025-10-27 19:13:34 | → | Typosit joins (b41a81e702@2001:bc8:1210:2cd8::494) |
| 2025-10-27 19:13:38 | <haskellbridge> | <loonycyborg> it entirely depends on source |
| 2025-10-27 19:13:41 | → | hellwolf joins (~user@b3d5-ec31-ba5a-9945-0f00-4d40-07d0-2001.sta.estpak.ee) |
| 2025-10-27 19:13:43 | <monochrom> | I have described my XYZ solution for that. |
| 2025-10-27 19:13:44 | → | elenril joins (~elenril@tutturu.khirnov.net) |
| 2025-10-27 19:13:45 | → | infinity0 joins (~infinity0@pwned.gg) |
| 2025-10-27 19:13:47 | → | nek0 joins (~nek0@user/nek0) |
| 2025-10-27 19:13:51 | → | synchromesh joins (~john@2406:5a00:2412:2c00:80f9:f3a2:4980:7e12) |
| 2025-10-27 19:13:52 | <haskellbridge> | <loonycyborg> if you know real encoding you can feed it to iconv first |
| 2025-10-27 19:14:04 | → | Beowulf joins (florian@2a01:4f9:3b:2d56::2) |
| 2025-10-27 19:14:23 | <haskellbridge> | <loonycyborg> or haskell function to recode it |
| 2025-10-27 19:14:54 | <EvanR> | bwe, there's three situations it seems: it's utf8, it's latin-1 (weird), or it's something else. So you succeed in two cases and fail in the other? |
| 2025-10-27 19:15:13 | <EvanR> | but, logically, can you distinguish between utf-8 and latin1 always? |
| 2025-10-27 19:16:07 | <haskellbridge> | <loonycyborg> https://github.com/Project-OSS-Revival/enca |
| 2025-10-27 19:16:36 | <EvanR> | I see no reason to go out of your way to transcode your stuff to utf-8 if all you need to do is decode it and work with it as Text |
| 2025-10-27 19:16:47 | <EvanR> | sending it to someone else is another story |
| 2025-10-27 19:16:51 | <tomsmeding> | EvanR: if it's valid utf-8 and we have a prior that it's either utf-8 or latin1, it's exceedingly likely it's utf-8 |
| 2025-10-27 19:17:04 | <EvanR> | it's a probability? Dx |
| 2025-10-27 19:17:36 | <tomsmeding> | well, yeah, any byte sequence is valid latin1 so to say anything useful there you have to estimate probabilities :p |
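
The asymmetry described here can be sketched in a few lines, assuming the `text` and `bytestring` packages. The helper names `utf8Reading` and `latin1Reading` are hypothetical; `decodeUtf8'` and `decodeLatin1` are the library functions.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | UTF-8 is restrictive: many byte sequences are simply not valid,
-- so this reading can fail.
utf8Reading :: BS.ByteString -> Maybe T.Text
utf8Reading = either (const Nothing) Just . TE.decodeUtf8'

-- | Latin-1 maps every byte to the code point of the same value,
-- so this reading always "succeeds", even on data that was never Latin-1.
latin1Reading :: BS.ByteString -> T.Text
latin1Reading = TE.decodeLatin1
```

For the bytes `C3 A9`, `utf8Reading` yields `é` while `latin1Reading` yields `Ã©`: both are legal readings, and only a prior about the source says which one was meant. For the lone byte `E9`, `utf8Reading` returns `Nothing` and `latin1Reading` yields `é`.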
| 2025-10-27 19:17:45 | <monochrom> | EvanR, I propose transcoding because I think that it fits aeson well. |
| 2025-10-27 19:17:57 | <tomsmeding> | transcoding is only a problem if you have performance issues with that |
| 2025-10-27 19:18:41 | <EvanR> | tomsmeding, I see what you mean now. And yeah, that's awful and why I was criticizing the situation |
| 2025-10-27 19:19:04 | <tomsmeding> | yeah I was supporting what you were saying :p |
| 2025-10-27 19:19:06 | <EvanR> | if you fail to utf-8 decode and fall back to latin-1, now you are liable to be working with nonsense (not latin-1) |
| 2025-10-27 19:19:18 | <tomsmeding> | yeah |
| 2025-10-27 19:20:14 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-27 19:20:29 | × | wbrawner quits (~wbrawner@static.56.224.132.142.clients.your-server.de) (Ping timeout: 250 seconds) |
| 2025-10-27 19:20:46 | → | Googulator86 joins (~Googulato@2a01-036d-0106-03fa-d161-d36f-e0e5-1b0a.pool6.digikabel.hu) |
| 2025-10-27 19:20:50 | × | Googulator20 quits (~Googulato@2a01-036d-0106-03fa-d161-d36f-e0e5-1b0a.pool6.digikabel.hu) (Quit: Client closed) |
| 2025-10-27 19:23:43 | <monochrom> | OK OK aeson has [either/throw]decodeStrictText, you can do your own utf8-or-latin1 decoding to Text, then give it to aeson. |
| 2025-10-27 19:24:16 | <bwe> | monochrom: That's exactly what my approach is. |
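
A rough sketch of the utf8-or-latin1 step discussed above, assuming the input arrives as a strict `ByteString`. The type `Assumed` and the function `decodeUtf8OrLatin1` are illustrative names, not anything bwe or monochrom posted. The sketch stops at `Text`, so it only needs `text` and `bytestring`; the result would then be handed to one of the aeson functions monochrom mentions.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE

-- | Which encoding the fallback settled on. Recording it lets callers
-- log or flag the Latin-1 case, which is a guess rather than a fact.
data Assumed = AssumedUtf8 | AssumedLatin1
  deriving (Eq, Show)

-- | Try strict UTF-8 first; if the bytes are not valid UTF-8,
-- reinterpret them as Latin-1, which cannot fail.
decodeUtf8OrLatin1 :: BS.ByteString -> (Assumed, T.Text)
decodeUtf8OrLatin1 bytes =
  case TE.decodeUtf8' bytes of
    Right t -> (AssumedUtf8, t)
    Left _  -> (AssumedLatin1, TE.decodeLatin1 bytes)
```

As EvanR points out earlier in the log, the Latin-1 branch also accepts input that is in some third encoding, so keeping the `Assumed` tag around is one way to avoid treating that output as trustworthy.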
| 2025-10-27 19:24:49 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 250 seconds) |
| 2025-10-27 19:25:25 | → | wbrawner joins (~wbrawner@static.56.224.132.142.clients.your-server.de) |
| 2025-10-27 19:27:24 | → | Frostillicus joins (~Frostilli@pool-71-174-119-69.bstnma.fios.verizon.net) |
| 2025-10-27 19:27:43 | → | Tuplanolla joins (~Tuplanoll@91-159-187-167.elisa-laajakaista.fi) |
| 2025-10-27 19:35:55 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-27 19:39:01 | × | koz quits (~koz@121.99.240.58) (Ping timeout: 264 seconds) |
| 2025-10-27 19:39:07 | → | qqe joins (~qqq@185.54.23.200) |
| 2025-10-27 19:39:49 | → | ljdarj joins (~Thunderbi@user/ljdarj) |
| 2025-10-27 19:41:18 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 260 seconds) |
| 2025-10-27 19:51:34 | → | Jackneill joins (~Jackneill@94-21-95-10.pool.digikabel.hu) |
| 2025-10-27 19:51:43 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-27 19:55:17 | → | koz joins (~koz@121.99.240.58) |
| 2025-10-27 19:56:34 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 246 seconds) |
| 2025-10-27 19:59:22 | × | trickard quits (~trickard@cpe-55-98-47-163.wireline.com.au) (Ping timeout: 246 seconds) |
| 2025-10-27 19:59:36 | → | trickard_ joins (~trickard@cpe-55-98-47-163.wireline.com.au) |
| 2025-10-27 20:01:48 | AlexZenon_2 | is now known as AlexZenon |
| 2025-10-27 20:05:46 | × | Googulator86 quits (~Googulato@2a01-036d-0106-03fa-d161-d36f-e0e5-1b0a.pool6.digikabel.hu) (Quit: Client closed) |
| 2025-10-27 20:05:48 | → | Googulator79 joins (~Googulato@2a01-036d-0106-03fa-d161-d36f-e0e5-1b0a.pool6.digikabel.hu) |
| 2025-10-27 20:07:29 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-27 20:10:52 | × | Frostillicus quits (~Frostilli@pool-71-174-119-69.bstnma.fios.verizon.net) (Ping timeout: 255 seconds) |
| 2025-10-27 20:12:27 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 260 seconds) |
| 2025-10-27 20:14:34 | → | Zemy_ joins (~Zemy@2600:100c:b046:9707:64c6:42ff:fe67:7876) |
| 2025-10-27 20:15:24 | → | humasect joins (~humasect@dyn-192-249-132-90.nexicom.net) |
| 2025-10-27 20:16:36 | × | lortabac quits (~lortabac@mx1.fracta.dev) (Read error: Connection reset by peer) |
| 2025-10-27 20:16:50 | → | lortabac joins (~lortabac@mx1.fracta.dev) |
| 2025-10-27 20:16:59 | × | Fijxu quits (~Fijxu@user/fijxu) (Quit: XD!!) |
| 2025-10-27 20:17:27 | × | Zemy quits (~Zemy@72.178.108.235) (Ping timeout: 244 seconds) |
| 2025-10-27 20:17:36 | × | sajith quits (~sajith@user/sajith) (Remote host closed the connection) |
| 2025-10-27 20:17:58 | → | sajith_ joins (~sajith@user/sajith) |
| 2025-10-27 20:19:18 | → | Fijxu_ joins (~Fijxu@user/fijxu) |
| 2025-10-27 20:21:14 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-27 20:24:48 | × | humasect quits (~humasect@dyn-192-249-132-90.nexicom.net) (Remote host closed the connection) |
| 2025-10-27 20:25:38 | → | Googulator28 joins (~Googulato@2a01-036d-0106-03fa-d161-d36f-e0e5-1b0a.pool6.digikabel.hu) |
| 2025-10-27 20:25:52 | × | Googulator79 quits (~Googulato@2a01-036d-0106-03fa-d161-d36f-e0e5-1b0a.pool6.digikabel.hu) (Quit: Client closed) |
| 2025-10-27 20:28:13 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 264 seconds) |
| 2025-10-27 20:31:03 | → | Zemy joins (~Zemy@2600:100c:b046:9707:7c26:54ff:fe94:28b9) |
| 2025-10-27 20:31:03 | × | Zemy_ quits (~Zemy@2600:100c:b046:9707:64c6:42ff:fe67:7876) (Read error: Connection reset by peer) |
| 2025-10-27 20:35:46 | → | jmcantrell joins (~weechat@user/jmcantrell) |
| 2025-10-27 20:39:17 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-27 20:40:07 | <haskellbridge> | <Morj> I just tried stack's nix integration. It's convenient |
| 2025-10-27 20:40:32 | <haskellbridge> | <Morj> What's not convenient is that now 'stack run' has to enter the shell every time which takes like 10 seconds |
| 2025-10-27 20:41:25 | → | notzmv joins (~umar@user/notzmv) |
| 2025-10-27 20:44:32 | → | Zemy_ joins (~Zemy@mobile-107-80-206-62.mycingular.net) |
| 2025-10-27 20:44:43 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 264 seconds) |
| 2025-10-27 20:46:48 | × | Zemy quits (~Zemy@2600:100c:b046:9707:7c26:54ff:fe94:28b9) (Ping timeout: 256 seconds) |
| 2025-10-27 20:51:54 | <haskellbridge> | <sm> ack |
All times are in UTC.