WEBVTT 00:00.000 --> 00:11.440 Perfect. Hello everyone. Glad you're all here. I'm Jacob Coffee. I work at the Python Software 00:11.440 --> 00:17.680 Foundation, on the infrastructure. Today, we're going to talk about my fake bread business 00:17.680 --> 00:24.560 called the bakery. The best I could do. But really, we're going to talk about PEP 810, 00:24.560 --> 00:28.720 which I know nothing about. But someone told me the best way to learn something is to teach 00:28.720 --> 00:37.360 it. So here we are. Hang on. We're going to cover what the problem is. The startup time 00:37.360 --> 00:44.880 is pretty rough with Python when it imports the world. We're going to explain PEP 810, like the 00:44.880 --> 00:50.000 syntax. Probably not a live demo because I didn't realize there would be so many people here and 00:50.000 --> 00:56.160 now you made me very nervous. Then real-world impact, measured, and then use cases beyond a silly 00:56.160 --> 01:10.080 bread business CLI tool, and then how to migrate. So the problem is that when you run anything in Python, 01:10.080 --> 01:17.520 it's going to import the world. So it's the eager import model, it loads everything, and we can see 01:17.520 --> 01:24.800 that it loads, it takes 234 milliseconds, which isn't great. If you've used Go or Rust CLI tools, 01:24.800 --> 01:31.840 they're very snappy. There are also problems with memory bloat. There's also a cold start tax, 01:31.840 --> 01:37.200 which is bad when you're doing serverless things. You probably don't want to use Python for serverless 01:37.200 --> 01:44.640 because of these reasons. And then you have nasty hacks like if TYPE_CHECKING. So what is 01:44.640 --> 01:52.480 PEP 810?
It's explicit lazy imports for Python, and it's basically going to say instead of 01:52.640 --> 02:00.400 bringing in the world, we have this module over here that we only call once a year or once however long, 02:00.400 --> 02:05.600 and we don't pull it in until it's first accessed, not just when it reads the import statements. 02:06.960 --> 02:17.680 Good news, it was accepted. So it's going to be in Python 3.15. It says it's 50 to 70% faster for 02:17.680 --> 02:24.320 certain workloads, with 30 to 40% less memory, which is good because we like trees, and it's a nice 02:24.320 --> 02:31.520 explicit syntax. So no surprises when it happens. You can easily document it. The syntax: 02:31.520 --> 02:39.280 normally we have just from breadctl, which is my bread business, it's very official, import, 02:39.280 --> 02:45.040 and we have baker, delivery, and inventory, and it's just going to import the world, even if you 02:45.040 --> 02:50.480 don't run anything. So even if I do the CLI tool with --help, it's still going to import the 02:50.480 --> 02:56.240 world, it's going to take forever. It's really annoying. But with this, the new thing, we lazy import 02:56.240 --> 03:04.480 the thing, and then you only do it when it is invoked. So that is going to be a big improvement. 03:04.480 --> 03:09.360 So if you run --help, you don't access baker, delivery, or inventory, or whatever, 03:09.440 --> 03:14.720 and now it just loads only when you run the one thing. So how is it going to work? 03:16.560 --> 03:24.320 So when Python sees the lazy import, like lazy import json or httpx, it doesn't actually import 03:24.320 --> 03:32.320 httpx, it's going to use a proxy object, it's going to fill the void there, and this proxy goes into the 03:33.280 --> 03:39.040 system modules namespace, I think I have that right, where the httpx import would normally be. 03:40.480 --> 03:45.040 But there's no file, I think that's correct, and then no code execution happens yet.
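A rough way to get a feel for this deferral on a Python that does not have the lazy keyword yet is the standard library's importlib.util.LazyLoader. It is a different mechanism from the one PEP 810 describes, so treat this as an analogy only. The helper name lazy_import and the choice of colorsys as the example module are assumptions made for illustration.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code only runs on first attribute access.

    This follows the LazyLoader recipe from the importlib documentation.
    It is NOT the proxy mechanism PEP 810 specifies, only a stand-in for it.
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ModuleNotFoundError(f"No module named {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up the deferral; module code has not run yet
    return module


# PEP 810 spelling of the same idea (Python 3.15+ only):
#     lazy import colorsys
colorsys = lazy_import("colorsys")         # cheap: nothing executed yet
hsv = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)   # first access triggers the real load
print(hsv)
```

The point of the sketch is only the ordering: binding the name is cheap, and the module body runs on the first attribute access.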
03:46.240 --> 03:51.600 So we parse it, we say lazy import httpx, that creates the proxy, then you have a lazy module 03:51.600 --> 03:58.640 proxy, then we do the waiting until we want to use it. Until you do like an httpx 03:59.280 --> 04:07.520 .get or whatever module you're calling. So your startup stays fast, your help commands are quick, 04:07.520 --> 04:20.240 all that. Okay, so the moment you actually use httpx, it does the real import, the proxy object 04:20.240 --> 04:24.800 is transparently replaced with the real module, I believe I have this correct, it's the reification 04:24.880 --> 04:30.160 process of that, and your code never knows the difference. So if it's never accessed, it 04:30.160 --> 04:34.880 never loads, and that's the whole trick. It sounds like it's this big thing, but it's just 04:34.880 --> 04:44.240 super simple. So boom, httpx is loaded, and we're all happy. So those are the three sort of phases, 04:44.240 --> 04:48.400 very, very dumbed down by someone that wanted to learn it and then share it with you all: the 04:48.480 --> 04:54.880 parse, wait, and access phases. We create the proxy, the proxy is dormant, and then we do the real import. 04:54.880 --> 05:04.720 So boom, it's not super complicated. We're going to see maybe if we can do the live demo, 05:05.440 --> 05:13.360 I'm terrified of this. Actually, look at this cow, it's in Scotland, it's very pretty. 05:14.080 --> 05:20.960 We're not going to do the live demo, like we should. Oh, how are we going? Because I did this 05:20.960 --> 05:27.440 earlier to make sure that I didn't look like a goober up here, but basically on the left side, 05:28.480 --> 05:34.640 we have, I think I flipped around one of these, yeah, okay, so the left side is the normal 05:35.280 --> 05:43.280 running, and then you have the right side with this dash click, which is, you can just look in the 05:43.280 --> 05:48.160 code and in the slides, I have some links.
You have the same thing running, it's a little bit slower, 05:48.160 --> 05:56.000 it's like seven times, or 7.5 times faster. And then I think I used Claude to help 05:56.000 --> 06:01.840 me because I'm bad at math. We can see some real numbers here. So 30 to 40 milliseconds 06:01.840 --> 06:05.920 faster is what the robot says, but we don't really always trust the robot, do we? 06:08.640 --> 06:13.600 So that's all you're going to get for a live demo. But yeah, you can see just from that one 06:14.720 --> 06:20.880 four-letter change, we have a much faster thing. This is a very silly demo, but you can see, 06:20.880 --> 06:25.200 for example, I maintain or help maintain Litestar, which is a web framework, and you could do the 06:25.200 --> 06:32.320 same thing for Flask or FastAPI. When you have this ginormous app, and you do some 06:32.320 --> 06:37.760 CLI command for it, it's going to take what feels like five or ten seconds to load. It's not really 06:37.760 --> 06:45.360 that long, I hope. Otherwise you have other problems. But with explicit lazy imports, you can now do this 06:45.360 --> 06:50.160 without hacking around it. I mean, you could lazy import things already. You could stuff things 06:50.160 --> 06:56.560 inside of functions, so they're only imported when the function is invoked. That's fine. But this is 06:57.200 --> 07:01.120 part of the language now. It's official. A nice, non-hacky way to do that. 07:05.840 --> 07:10.880 Okay, so we have some benchmark results here just from my silly examples. This is going to use 07:10.880 --> 07:15.920 the cappa CLI framework. That's just one that I like, but actually if you, 07:16.880 --> 07:23.600 it's supposed to be faster, but if you compare against CLI and some others, I think cappa is not 07:23.600 --> 07:31.520 the fastest, but I like the syntax.
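The function-local workaround mentioned above looks roughly like this today. The command names and the use of json as the "heavy" dependency are made up for illustration. With PEP 810 the import could move back to the top of the file as a lazy import instead.

```python
import sys


def show_help():
    """Cheap command: touches no heavy dependency."""
    return "usage: bakery [help|inventory]"


def show_inventory(stock):
    """Expensive command: pays for its import only when it is invoked."""
    import json  # deferred until the first call to this function

    return json.dumps(stock, sort_keys=True)


def main(argv):
    """Dispatch on the first argument; default to the help text."""
    command = argv[1] if len(argv) > 1 else "help"
    if command == "inventory":
        return show_inventory({"sourdough": 12, "baguette": 30})
    return show_help()


if __name__ == "__main__":
    print(main(sys.argv))
```

The trade-off is that the import is hidden inside the function body, which is the part the explicit syntax is meant to clean up.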
So just for me, throwing in this lazy import, I got a 23% speedup 07:31.520 --> 07:38.240 for the help command, module import time is 26% faster, and then my inventory command, which does 07:38.240 --> 07:42.240 a whole lot, did not change. So that's the thing, you're not always going to see an improvement, 07:42.240 --> 07:48.240 so it's not like you should go and just find-and-replace import this with lazy import this, 07:48.240 --> 07:56.080 so I think that's a good solution. Some real-world impact: Meta. They have their own 07:56.080 --> 08:03.280 fork of CPython, and they do lazy imports, and they have a 70% startup reduction 08:03.280 --> 08:11.200 and 40% memory savings, which is what PEP 810, if you look at the docs, says could 08:11.280 --> 08:17.520 be expected for your own projects. Same thing for HRT, they have module-level lazy imports, 08:17.520 --> 08:23.520 and for PySide, they have Qt bindings, a 35% startup improvement. So that's pretty significant 08:24.240 --> 08:30.080 for your end users or yourself, whichever. And these are not experiments, these are real-world, 08:30.080 --> 08:37.840 ginormous companies that do cool things or not cool things, but they are big production 08:37.920 --> 08:48.080 companies using the code, so it works. So PEP 810 is not just for CLI tools, that's just one 08:48.080 --> 08:57.520 use case, it's my very silly example for that. So we have type checking. If you ever want to do 08:59.520 --> 09:04.560 if TYPE_CHECKING and that whole block, where it's just this ugly little block, you can 09:04.560 --> 09:12.800 instead do a lazy import of whatever type thing you're using, and the type checking block goes away. 09:14.800 --> 09:22.320 I think it's error-prone, and a lot of our linters yell at us for leaving things out, or for things 09:22.320 --> 09:28.240 that maybe should go in the type checking block, so it's very confusing.
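For reference, the if TYPE_CHECKING block being described usually looks like the sketch below. Decimal stands in for an expensive type-only import and order_total is a hypothetical function. Per the talk, a lazy import at module level would make the guard unnecessary.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only static type checkers take this branch. At runtime it is skipped,
    # so the import costs nothing when the program starts.
    from decimal import Decimal


def order_total(prices: "list[Decimal]") -> "Decimal":
    """Sum a non-empty list of prices.

    The annotations are strings because Decimal is not bound at runtime.
    """
    total = prices[0]
    for price in prices[1:]:
        total = total + price
    return total
```

The string annotations are the awkward part: the name exists for the type checker but not for the running program.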
And there are also some things, 09:28.240 --> 09:32.800 like I said earlier, with serverless and cold start environments, so 09:32.800 --> 09:39.280 every Lambda cold start costs money and uses up your users' patience. So if we're seeing, for 09:39.280 --> 09:48.080 example, that Meta saw the 50 to 70% speedup, then that's something that will really help 09:48.080 --> 09:55.600 save some money and make users happy for your serverless runs. And also for memory-constrained 09:55.600 --> 10:03.040 environments. So these are all good things. It's a simple swap-out for the things that you can 10:04.240 --> 10:14.160 identify as needing this and, I don't know, it's just simple, it's good. But how do I do it? I can 10:14.160 --> 10:20.240 take you through this. So you want to go through and profile, probably better than I have profiled 10:20.320 --> 10:31.520 here, but profile your application, whatever it may be, look for heavy things like numpy or 10:32.160 --> 10:39.840 pandas or httpx or anything greater than 50 milliseconds to start with, and then apply the lazy 10:39.840 --> 10:48.960 import selectively. The module-level init is going to happen on the first access, and you should see 10:48.960 --> 10:53.920 an improvement. You could test this with hyperfine. All of the benchmarks that I have done have been 10:53.920 --> 10:59.840 with hyperfine, and it's just a Rust CLI tool for benchmarking. It's very fast. And then 11:00.400 --> 11:06.800 your Python can finally go almost as fast as Go. Don't quote me on that, but probably. 11:08.800 --> 11:15.760 So, some caveats and gotchas. Import-time side effects are deferred, so you may want to 11:16.480 --> 11:23.680 note or document that things that might configure logging at import time might need to be 11:23.680 --> 11:29.760 not lazy imported. That's the "don't just blindly lazy import the world" point.
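One way to do the profiling step is CPython's -X importtime flag, which prints per-module import timings to stderr. The sketch below parses that output and applies the 50 millisecond rule of thumb from the talk. The function names and the sample text are illustrative assumptions, not something shown in the talk.

```python
import subprocess
import sys


def parse_importtime(stderr_text, threshold_ms=50.0):
    """Return (module, cumulative_ms) pairs at or above threshold_ms, slowest first."""
    slow = []
    for line in stderr_text.splitlines():
        if not line.startswith("import time:"):
            continue
        parts = line[len("import time:"):].split("|")
        if len(parts) != 3:
            continue
        try:
            cumulative_us = int(parts[1])
        except ValueError:
            continue  # header row: "self [us] | cumulative | imported package"
        name = parts[2].strip()
        if cumulative_us / 1000 >= threshold_ms:
            slow.append((name, cumulative_us / 1000))
    return sorted(slow, key=lambda item: item[1], reverse=True)


def profile_import(module_name):
    """Run `python -X importtime -c "import <module>"` and return its stderr."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module_name}"],
        capture_output=True, text=True, check=True,
    )
    return result.stderr


# Made-up sample in the shape -X importtime prints; the module names are fake.
sample = """import time: self [us] | cumulative | imported package
import time:       210 |        210 |   _json
import time:     61000 |      75500 | heavy_dependency
import time:       900 |       1200 | small_helper
"""
print(parse_importtime(sample))
```

On a real project, something like parse_importtime(profile_import("your_package")) would list the candidates worth making lazy, and hyperfine can then compare the command before and after the change.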
I can't wait to see 11:29.760 --> 11:35.920 some people do that, though, and get some error reports. So other things: type checkers still 11:35.920 --> 11:43.600 need updates. As of last night, I think, I don't know if any of them has support. ty does not, it has an issue open; 11:44.560 --> 11:49.920 Ruff, for the linting part; and then mypy and Pyright I have not checked, but I assume they do not. 11:50.640 --> 12:02.320 Also, let's see, import errors move. So if before you had some optional dependency, 12:02.320 --> 12:11.120 like I had an optional one in my pyproject.toml, that was Sentry, and nothing happened when I ran 12:12.080 --> 12:17.040 the app, but I forgot to do the right flag or incantation or whatever to add 12:17.040 --> 12:22.560 Sentry as an optional dependency. You're not going to know about it until the app is running, 12:22.560 --> 12:27.920 because we're not going to get an error. It seems simple, but you just have to kind of think about 12:27.920 --> 12:34.240 these things. Nothing fails until Sentry is called, and then it throws an error because no module was found. 12:34.240 --> 12:41.200 It's not always faster, so it's best for conditional and optional imports, 12:42.080 --> 12:48.160 and then some tips: just test with simple things, like your help, if you're actually running a 12:48.160 --> 12:56.320 CLI, or optional dependencies. Please keep your eager imports. Don't just lazy import the 12:56.320 --> 13:02.000 things that you rely on, like logging configuration, and then document things for your co-workers, 13:02.080 --> 13:08.080 they will love you for this. So they know this thing is lazy-loaded. When this error happens, 13:08.080 --> 13:13.600 you don't have to Google around or ask Claude, because Claude's not going to be up to date on this 13:13.600 --> 13:20.800 for a while, probably. So one thing, oh, there are some docs in the official PEP also. 13:21.520 --> 13:28.400 You can't lazy import inside functions.
You can't lazy import inside if TYPE_CHECKING, I don't know why you would do 13:28.720 --> 13:34.480 that, but you cannot do that. We can go over some caveats actually here. 13:37.200 --> 13:41.520 Yeah, you can't do it inside functions. You can't lazy import inside classes, because then you kind of, 13:42.080 --> 13:46.240 it's only going to be in the global namespace at the top, but I mean, the reason is 13:46.960 --> 13:52.960 because these were, I guess, implicitly lazily imported. This is what people do currently to 13:52.960 --> 13:59.360 try and speed up and get around this. And then this ugly thing that people like to do 13:59.360 --> 14:04.640 sometimes, where we import star, we should not do that. So I'm glad that this is a syntax error. 14:06.160 --> 14:18.880 So, awesome, reset me. So it's not yet available. There's a reference implementation available now 14:18.880 --> 14:26.880 through a CPython fork, and there's a link in the notes that I'll 14:26.880 --> 14:35.040 share out, the Lazy Imports Cabal. I think it's the GitHub organization for that. 14:35.840 --> 14:39.680 As for the Zen of Python, I think this is good, because in the Zen of Python explicit is better than implicit. 14:40.400 --> 14:47.600 That's the motto to live by. And this manifests that for us all. And it lets us make good choices, 14:47.600 --> 14:54.080 not have some footgun that, well, maybe you can footgun yourself if you do some bad things. But for the most 14:54.080 --> 15:00.800 part, it is good. No surprises here. And it's local. It's only going to affect the one thing that 15:00.800 --> 15:07.520 you tag lazy on. So say you import sys, and then lazy import json. 15:07.520 --> 15:13.200 It's not going to mess up sys, or any example like that. And you have that granular level of control, 15:13.280 --> 15:21.760 which is very nice. Back on this, you can mix lazy and eager imports 15:21.760 --> 15:26.960 freely, if you were wondering.
So there are no limitations around that. And then some resources. 15:26.960 --> 15:33.440 There's the PEP, peps.python.org, PEP 810. There's this very bad demo page for breadctl, 15:33.440 --> 15:38.560 my very official bread business. There's the CPython fork, github.com/LazyImportsCabal, 15:38.560 --> 15:47.200 and then hyperfine, which I've done all my benchmarking with. That's a thank you page. 15:47.200 --> 15:57.840 But I wanted to go over, let's see. So more in-depth things that I'm glad 15:57.840 --> 16:03.520 that I have time for. So I mentioned reification. I glanced over it, but we have this lazy object. 16:03.520 --> 16:10.480 It needs to be reified, or made real. So right now we have lazy import foo. Right now foo 16:10.480 --> 16:17.840 is just a placeholder in the system modules. Until we call that or do something 16:17.840 --> 16:23.440 that invokes it, like if you call type on foo, then that would bring it into the system modules. 16:25.600 --> 16:34.480 Here we have an example of this. We don't we don't have a man. Thank you. 16:34.560 --> 16:44.560 Here we have, this is a silly example where we are not going to have this issue. 16:45.360 --> 16:51.120 It has a typo, and the error only occurs on the first use. That is a bad thing. You could mistype 16:51.120 --> 16:57.120 something, lazy import it, and it's not going to know that it doesn't exist, because it doesn't, 16:58.080 --> 17:04.720 technically, until it reifies. So that is something to kind of think about. 17:06.880 --> 17:10.720 Perfect. Now thank you. Okay, I'm done. 17:17.840 --> 17:23.120 If you have any questions, like I said, I did this to learn it. So I'm not going to be the expert in the room here. 17:23.760 --> 17:28.320 We do have a bunch of CPython core developers. So if you ask me, I'm probably going to point 17:28.320 --> 17:35.680 to them, but question away.
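Because a misspelled or missing module only fails when the lazy name is first used, one possible guard is to check at startup, or in a test, that every lazily imported module can at least be located. This is a sketch using importlib.util.find_spec, which for top-level modules locates the module without executing it. The function name and the module list are hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be located.

    For top-level modules, find_spec looks the module up without running
    its code, so the check itself stays cheap.
    """
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Hypothetical list of modules a project imports lazily; the last is a typo.
LAZY_MODULES = ["json", "decimal", "jsno_typo_example"]
print(missing_modules(LAZY_MODULES))
```

A check like this moves the "no module found" surprise back to startup or CI without giving up the deferred load.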
Okay, so you can ask the questions on the chat or just raise your hand 17:35.680 --> 17:45.520 and I'll give you the mic. Any questions? Two. Two questions from the chat. 17:46.480 --> 17:51.600 Yeah. Okay. Two questions from the chat. Okay. First question. Does it solve the 17:51.600 --> 18:00.800 problem with circular imports? Circular imports? Yeah. Well, I think that might still be a problem, 18:00.800 --> 18:07.680 but only a problem when the lazy import is reified and it becomes a real object. Does that 18:07.840 --> 18:12.480 sound right? You do? Yeah. You're still going to have the circular import, but it's not going 18:12.480 --> 18:23.200 to throw the exception until that is accessed. Okay. So good question. What stops me from using lazy 18:23.200 --> 18:32.720 imports everywhere because I'm lazy? Nothing. Yeah. You can try it. There are probably 18:32.720 --> 18:41.440 some workloads where that might work. I mean, my example was logging: if, 18:41.440 --> 18:47.280 for some reason, you lazy imported your logging.py file and that configured all of your logging 18:47.280 --> 18:52.720 things, we're not going to have anything done until that first call. So you might miss out on 18:52.720 --> 18:58.720 some things. I don't know if I have that quite right. This one? I'm sorry. In the back? Yeah. 18:58.720 --> 19:13.040 Can we? Oh. Where's the question? Thank you for the talk, and to anybody here that 19:13.040 --> 19:17.600 contributed to the PEP, it sounds very good. Can we expect cascading improvements? There are 19:17.600 --> 19:22.800 really expensive imports that you mentioned, like pandas and numpy. Can we expect that as they put 19:22.800 --> 19:27.760 in lazy imports internally, we'll also get improvements even if we don't lazy import those 19:27.920 --> 19:34.160 branches? Yeah. So with my example, that would be Litestar. We provide a CLI with the 19:34.160 --> 19:41.440 web framework that you can call.
So for that example, and for yours, for when numpy brings this in, 19:42.320 --> 19:49.680 when, I guess, 3.15 is, well, when 3.15 is out and people are only using 3.15 in their project. Yeah, 19:49.680 --> 19:56.400 you can expect to see improvements just by pip updating their requirements. And then, you know, 19:56.400 --> 20:03.760 they'll have the speedup rates. Following on from that, is there a way to feature test this if 20:03.760 --> 20:08.400 you're still supporting older Python versions that wouldn't have it? Yeah. So I did this, 20:10.080 --> 20:16.560 the initial work here. This is hard to talk in. But of course, Hugo here already wrote a blog. 20:16.560 --> 20:22.720 As soon as I finished my talk, I realized that Hugo had basically done this. So he tested this 20:22.880 --> 20:31.840 basically by line. So you can bring this into your project by adding this reference implementation 20:33.040 --> 20:43.280 line to your tools about which Python you're using. And there will be a link in my repo for this 20:43.280 --> 20:50.960 talk. And basically here we're saying that I'm not Python 3.15, I'm Python 3.14, because if you try this 20:51.040 --> 20:55.040 now and do like Ruff lint or Ruff check or whatever, it's going to yell at you because they don't 20:55.040 --> 21:03.280 support 3.15. And so you compile this special fancy version, then you can pip install the packages 21:03.280 --> 21:11.120 or your project, and then you can test it yourself. Hugo also did some hyperfine benchmarking 21:11.120 --> 21:18.240 and then went on to do some fully lazy benchmarking, with even greater import performance. 21:18.240 --> 21:24.400 So almost three times faster. So you can test it now, just not very easily. 21:28.400 --> 21:35.200 First, thank you to all the people who actually did contribute to this. I have a question 21:35.200 --> 21:40.080 which is about the overhead. Let's suppose that you replace all your imports with lazy imports.
21:41.040 --> 21:50.240 Like the gentleman on the chat said, would that have any drawback? Like would there be any overhead? 21:50.240 --> 21:55.120 Let's suppose that it works, right? I don't think there's any runtime overhead from 21:55.120 --> 22:03.120 doing it, no? I think there's like zero, right? Yeah, I'm pretty sure it's zero runtime 22:03.120 --> 22:08.640 overhead on that. Okay, and also for the circular imports, I imagine that it would delay 22:09.280 --> 22:16.880 discovering that, oh, I have a bug, until much later into the runtime. Yeah. Yeah, so it could be bad. 22:16.880 --> 22:24.240 That's why you should selectively do the lazy imports instead of doing a find-and-replace and 22:24.240 --> 22:31.920 massively replacing it in your whole project. You have more questions? Yeah, out of time. Unfortunately, 22:32.080 --> 22:36.960 if you have more questions, then please just comment and then talk to Dr. Director at Dr. Yeah. 22:36.960 --> 22:42.880 Or put it maybe into the chat. Thank you all. Thank you!