
Talking Technicians
S06-E03 Special Episode: Crossover with Troubleshooting Technicians
This special episode is a crossover episode with the Troubleshooting Technicians Podcast. Troubleshooting Technicians is a podcast about how technicians troubleshoot and solve problems in their work. This episode features Sean from Pfeiffer Vacuum. Sean describes general troubleshooting steps and then goes into a specific example of how to troubleshoot a vacuum system. Give the Troubleshooting Technicians podcast feed a follow!
The Talking Technicians podcast is produced by MNT-EC, the Micro Nano Technology Education Center, through financial support from the National Science Foundation's Advanced Technological Education grant program.
Opinions expressed on this podcast do not necessarily represent those of the National Science Foundation.
Join the conversation. If you are a working technician or know someone who is, reach out to us at info@talkingtechnicians.org.
Links from the show:
Episode Web Page:
https://micronanoeducation.org/students-parents/talking-technicians-podcast/
Troubleshooting Technicians on Apple Podcasts: https://podcasts.apple.com/us/podcast/troubleshooting-technicians/id1835766744
Troubleshooting Technicians on Spotify: https://open.spotify.com/show/165fyRtUJqS3L2GBtoo6E0
Pfeiffer Vacuum: https://www.pfeiffer-vacuum.com/us/en/
Peter Kazarinoff 0:02
From MNT-EC, the Micro Nano Technology Education Center, this is Talking Technicians. The podcast about technicians: who they are, what they do, and where they come from. I'm your host, Peter Kazarinoff. I teach technicians and engineers at Portland Community College. In each episode, you'll meet a working technician and hear their story. That means real interviews, with real technicians, about real jobs. At the end of each episode, you'll hear actions you can take if you want to be a technician too.
In this special episode of the Talking Technicians Podcast, we're having a crossover with the Troubleshooting Technicians Podcast. Troubleshooting Technicians is a podcast about the vacuum industry and how to do troubleshooting when you're working as a technician. So it's adjacent to the work that we do here at the Talking Technicians Podcast, and we wanted to be able to highlight one of their stories in this feed. So after you listen to this episode, check out Troubleshooting Technicians.
In this episode, you'll meet Sean. Sean works for Pfeiffer Vacuum and has quite a bit of experience with troubleshooting. So, Sean, thank you so much for coming and being able to talk with me about troubleshooting.
Sean 1:30
My pleasure. Thanks for having me.
Peter Kazarinoff 1:32
So, Sean, there's sort of a general set of steps when you're talking about troubleshooting, and in particular, troubleshooting vacuum systems. So first, how about we talk about what those steps are, and then after that, maybe we can sort of imagine a sample problem and how we can apply that troubleshooting strategy. So if somebody's doing some troubleshooting with a vacuum system, what is the general set of steps that you would advise they go through?
Sean 2:09
Well, you know, what is the problem? Defining what the issue is is the very first step. Why are you concerned? Why are you even, quote, in troubleshooting mode? To do that, it is also good to have a baseline of what is normal and good, so that you know when you have a situation that isn't good. So characterization of your systems, this is where it really helps you in understanding things, and so in troubleshooting.
Peter Kazarinoff 2:50
Right - So once you've identified that problem, then what's the next thing a technician should think about?
Sean 2:57
So sometimes the identification of a problem is based upon data itself, but you're going to need to collect some information. So for example, if you know your chamber normally gets down to one to the minus three, and it's only getting to one to the minus one, well, there's your data, and that data alone is your trouble statement: my chamber is not achieving the pressure that it's supposed to. But there's a lot more data to be had here. When was the last time it was working properly? What has happened in that period of time? So, you know, it's not unusual to hear: hey, the chamber was pumping down great before we did the PM, and after the PM, it's not pumping down. Well, that would lead someone to assume that something happened during the PM.
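As a rough illustration of turning a baseline and a reading into a trouble statement, here is a minimal Python sketch. It assumes "one to the minus three" means 1e-3 in whatever pressure unit the gauge reports (the episode does not name the unit), and the `tolerance` factor and the `BaselineCheck` name are hypothetical, not something Sean specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaselineCheck:
    baseline: float          # known-good base pressure (unit not stated in the episode)
    measured: float          # current base pressure, same unit
    tolerance: float = 2.0   # hypothetical: how many times the baseline still counts as normal

    def problem_statement(self) -> Optional[str]:
        """Return a quantified trouble statement, or None if the reading looks normal."""
        ratio = self.measured / self.baseline
        if ratio <= self.tolerance:
            return None
        return (f"Chamber reaches {self.measured:g}, "
                f"expected {self.baseline:g} ({ratio:g}x too high)")


# Assumed reading of Sean's example: normally 1e-3, currently only 1e-1.
check = BaselineCheck(baseline=1e-3, measured=1e-1)
print(check.problem_statement())
```

The number alone is only the starting point. The other questions Sean lists (when it last worked, what changed since) still have to be recorded alongside it.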
Peter Kazarinoff 3:56
Right - So, Sean, we've collected this data, and we also have to know what the previous state of the system was and when it was running well. Then what should a technician do next, after they've got all this data in hand?
Sean 4:11
Well, now you're going to be hypothesizing: what are the different things that could cause this? And as normal, lazy people, we generally look at what's most likely or most common, right? So, for example, if this PM was to open up the chamber, disconnect components X and Y, and then reassemble, and you have a base pressure problem, you might want to look at those components and those joints, right? Maybe they were leaking. And it takes experience of your system to know that. Now, in some cases, for example in a computer chip factory, you've got the chamber upstairs, often maintained by one set of technicians, and a different set of technicians down in the sub fab. After the PM is done up in the fab, the technicians call the pump techs, and they say, your pump has a problem. It was working fine before our PM, and now it's not. I've had that statement many a time, and the first thing I would do, if I have an isolation valve, is close the valve above the pump and get a base pressure reading between the valve and the pump. Basically, find out whether the pump is guilty or not. In this situation, as I've described it, nine times out of 10, the pump is fine. It's pumping down to one to the minus three, or whatever it's supposed to be. And now we know that there's a problem above that valve, and it's often in the chamber after a PM. So at this point, the reasonable thing, if it's quick to do, would be to take apart the components that were last taken apart, make sure that things are clean, that it's put together properly, and all that stuff. And if those quick and easy ones aren't bearing fruit, then in this situation you probably pick up a leak detector, hook it up, and start checking, because a leak is the most probable cause.
But we also have to understand that there are other things. We could have a virtual leak, like maybe one of our MFCs is leaking, and things like that. So there are other possibilities, but most of us technicians go with what hits 90% of the time.
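The isolation check Sean describes ("find out whether the pump is guilty or not") can be sketched as a single comparison. This is a hedged illustration only: the function name, the `tolerance` factor, and the reading of "one to the minus three" as 1e-3 are all assumptions.

```python
def pump_is_guilty(isolated_reading: float,
                   pump_base_pressure: float,
                   tolerance: float = 2.0) -> bool:
    """Decide which side of the isolation valve to suspect.

    isolated_reading: pressure between the closed valve and the pump.
    pump_base_pressure: what the pump normally reaches when isolated.
    tolerance: hypothetical margin, as a multiple of the normal base pressure.
    """
    return isolated_reading > pump_base_pressure * tolerance


# Pump reaches its usual base pressure with the valve closed:
# the problem is above the valve (often the chamber after a PM).
print(pump_is_guilty(isolated_reading=1e-3, pump_base_pressure=1e-3))  # False

# Pressure stays high even when isolated: suspect the pump side.
print(pump_is_guilty(isolated_reading=1e-1, pump_base_pressure=1e-3))  # True
```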
Peter Kazarinoff 6:56
So Sean, we've analyzed our data, but what happens if we're not able to come up with a potential solution. What happens if not enough data exists? We don't know enough about the problem, what should we do then?
Sean 7:11
So you have a problem statement, and you can't think of any potential causes. Basically, you're scratching your head, right? You're like, I have no idea. It should be working. It isn't. There's no explanation. At that point, escalation to other people is your primary tool. Bringing in other people who have more experience than you, or different experience, can be helpful, or even going outside your organization, maybe to the equipment manufacturer or people like that, to understand what's going on. If everyone is still scratching their heads, you need to broaden your data collection. You need to cast a broader net and pull in more data. It's rare that I've gotten to the point where even bringing in the, quote, experts on your system, like manufacturers or other people, doesn't get you a path of: okay, go check this, this and this, or give me these readings. But yeah, that would be my next step.
Peter Kazarinoff 8:36
So Sean, if we then do have enough data to come up with an idea for a solution, like, what we're going to implement to try to fix this system. How do we go about testing that solution? I have an idea of like, what the problem is, but now I have to test and make sure that that's the thing that's actually causing the problem, right?
Sean 8:59
So generally speaking, you would be doing either the repair or replacement of a component. So let's go back to the example of not achieving good pressure. I have a theory that the flange or whatever was not connected properly. I look at it, and visually it looks like it's not connected properly. Even those KF flanges will be a little off kilter sometimes. And so I would take it apart, put it back together, and then just pump down and test it. So that'd be a simpler way of doing it. Sometimes you do a toggle on, toggle off type thing, where the failure mode is something you can turn on and turn off. So, for example, if you suspected an MFC of leaking, and there's a downstream valve between the MFC and the chamber, I would shut that valve, and if the pressure goes back down, then I would open that valve back up, and the pressure goes back up, and I shut that valve again. I'm toggling on and off, and that's a very clear indication that, yep, that MFC is flowing gas and it shouldn't be.
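The toggle on, toggle off test can be sketched as a check that the pressure follows the valve state every time. The data layout, the `threshold` value, and the function name below are hypothetical; the idea (pressure high whenever the valve downstream of the MFC is open, low whenever it is shut) is the one Sean describes.

```python
from typing import List, Tuple


def toggle_confirms_leak(observations: List[Tuple[bool, float]],
                         threshold: float) -> bool:
    """observations: (valve_open, pressure) pairs recorded while toggling.

    Returns True only if every open-valve reading is above the threshold
    and every closed-valve reading is at or below it.
    """
    open_readings = [p for valve_open, p in observations if valve_open]
    closed_readings = [p for valve_open, p in observations if not valve_open]
    if not open_readings or not closed_readings:
        return False  # need at least one reading in each state to compare
    return min(open_readings) > threshold and max(closed_readings) <= threshold


# Assumed numbers: pressure near 1e-1 with the valve open, 1e-3 with it shut.
log = [(True, 1e-1), (False, 1e-3), (True, 1e-1), (False, 1e-3)]
print(toggle_confirms_leak(log, threshold=1e-2))  # True
```

Repeating the toggle more than once is what makes the result convincing: a single drop in pressure could be coincidence, a pressure that tracks the valve both ways is much harder to explain otherwise.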
Peter Kazarinoff 10:22
So Sean, what about if we implement this solution, but it doesn't work because that wasn't actually the problem? What would be the next step that a technician should think about?
Sean 10:35
Okay, so for example, I go reattach that KF fitting, and then we pump back down. We're still having issues, still not basing out. Don't rule out that there are multiple failure modes, especially with leaks. So it is possible that instead of getting down to one to the minus three, because we were at one to the minus one, let's say, originally, we get down to one to the minus two. Okay, so we've solved part of the problem. Well, that probably indicates to me that there are other joints that had been taken apart during the maintenance. So don't rule out that you have multiple things going on. But if it did not fix the problem, what you probably did initially was, from your data, you said, okay, here are the things that could be my problem, and you went to number one, right? Okay, most likely it was this. Well, obviously, if it's not that, then you just start working down the list in order of probability. Probabilities come through reasoning, and they're also determined through experience. You know, hey, we find that this power supply is kind of a flaky power supply, so it often goes out. That's just experience. So, yeah, that'd be my next step. Basically, as your flow diagram puts it, you just go back around. You just keep feeding this loop until you get the desired result: you tested and it passed.
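The loop Sean describes (work down the list in order of probability, retest after each fix, stop when the test passes) can be sketched as follows. The hypothesis names, the probabilities, and the simulated leak sizes are made up for illustration; only the shape of the loop comes from the episode.

```python
from typing import Callable, List, Tuple


def work_down_the_list(hypotheses: List[Tuple[str, float]],
                       fix_and_measure: Callable[[str], float],
                       target: float) -> List[Tuple[str, float]]:
    """Try fixes from most to least probable until the pressure meets the target.

    Returns the (fix, pressure after fix) history, which doubles as a record
    of partial improvements when more than one failure is present.
    """
    history = []
    for name, _probability in sorted(hypotheses, key=lambda h: h[1], reverse=True):
        pressure = fix_and_measure(name)
        history.append((name, pressure))
        if pressure <= target:
            break  # tested and it passed
    return history


# Simulated system with two leaks (hypothetical contributions to base pressure).
leaks = {"KF flange": 0.09, "chamber lid joint": 0.009}

def fix_and_measure(name: str) -> float:
    leaks.pop(name, None)               # fixing a non-leaking item changes nothing
    return 1e-3 + sum(leaks.values())   # 1e-3 is the assumed healthy base pressure

ranked = [("MFC leaking", 0.1), ("KF flange", 0.6), ("chamber lid joint", 0.3)]
for step in work_down_the_list(ranked, fix_and_measure, target=1e-3):
    print(step)
```

In this simulated run the first fix only gets part of the way to the target, which mirrors Sean's point that a partial improvement suggests a second leak rather than a wrong guess.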
Peter Kazarinoff 12:18
Right - So we've developed this new solution, and then we test this new solution. What are some ways that a technician can evaluate those results and know whether that solution is the thing that fixes the problem or not? In some ways this seems sort of obvious, but how would we also relate that back to our initial data gathering as well. How would a technician evaluate those results?
Sean 12:44
Yeah. Well, you go back to your problem statement, and hopefully you've got a quantified problem statement. Pressure should be one to the minus three. It's currently one to the minus one. You implement your fixes. What is the pressure? Is the pressure one to the minus three again, right? That's pretty simple, but it's not always simple, right? Intermittent problems are the most frustrating things to deal with, because you can't just get an instant toggle on, toggle off result. So there are cases where you're going to have intermittents. What helps you with that is going back to the definition. So what is the problem? Well, once out of every 10 runs, I have problem X. What that really means is you've got to go more than 10 runs before you can prove that you've solved the problem. So those are the frustrating ones to deal with. And in troubleshooting those intermittents, you go, okay, most likely it's this. Let's say we're doing more of the electronics: this relay is most likely intermittently causing issues, so you change it, and then you wait for that period of time.
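To put a number on "more than 10 runs", here is a small sketch. It assumes each run fails independently with the same probability, which is a simplification not stated in the episode, and the 95% confidence level is an arbitrary choice. If the fault were still present, the chance of seeing n clean runs in a row is (1 - rate) to the power n, so you keep running until that chance is small.

```python
import math


def clean_runs_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Smallest n such that n clean runs in a row would happen by luck
    with probability at most (1 - confidence) if the fault were still there."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


# Fault shows up once in every 10 runs.
print(clean_runs_needed(0.10))   # 29 clean runs for 95% confidence

# Chance that 10 clean runs happen by luck with the fault still present.
print(round(0.9 ** 10, 3))       # 0.349
```

Under these assumptions, ten clean runs still leave roughly a one in three chance that nothing was fixed, which is why Sean says you have to go well past ten.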
Peter Kazarinoff 14:27
So Sean, what are some sort of key lessons learned, or like key takeaways about troubleshooting based on all your years of experience working with vacuum systems?
Sean 14:39
The first thing is, especially as you get more experienced, you get lazy, and you use the troubleshooting procedure less, and you just go, oh, okay, I know this. It's going to be this, right? We go to our shortcut. If the last five times the system has had a problem, replacing widget X fixed the problem, and you see a similar symptom, you just go, okay, replace X. You don't go through, let's collect data, all this. That's a laziness that I think all of us get to. We just assume. And I see that even in my regular everyday life, right? When we see a similar situation, we assume the same cause every time. And that's not necessarily a bad thing, but as soon as you change that widget X and it doesn't solve it, that should be a big warning to you, or indicator to you, that you need to stop and go back to the process and use the process.
Peter Kazarinoff 15:51
So Sean, let's maybe think about one specific scenario and see if we can quickly go through our problem solving steps for how a technician might analyze it. So let's imagine that we're working in a semiconductor foundry or a semiconductor factory, and we've got a vacuum system that contains a vacuum pump, a flange, and a gauge, and that vacuum pump is connected to an electronic system. How would you go about troubleshooting the fact that what you can see through that gauge is that you've got a vacuum leak or your pressure isn't low enough? Let's go through those problem solving steps.
Sean 16:41
Okay, so I was already kind of doing a similar scenario, because it's a very common scenario in vacuum. So first thing is, the gauge is reading a certain pressure, a high pressure. As I said, the first thing I would want to do, because in a semiconductor fab we've got complex systems where that pump is pumping on many components, is to split the system in half. Either I close the valve, or if the valve isn't there, I get permission to turn off the pump, cap it above a certain point, and then get a reading. Okay, now at this point, if I get good pressure, well, then I know the problem is above that, and then either that's someone else's responsibility, or it's my responsibility, and I have to go looking more into why that pressure is high. What's that problem? I've got a flow rate, basically, either a virtual leak or a real leak, that the system was not expecting. It is what it is. The pump is good. Now, if the pressure is still high, even though I am isolated above the pump and I've got the gauge in between, it could be pretty much two things, two to three. I could have a leak in that very short piece of piping. It's not highly likely, but it's possible. I could have a bad gauge, or I could have a bad pump. At this point, if I have any reason to first suspect a leak, I could just put a gauge directly on the inlet of the pump. Or if I suspect the gauge, I could swap the gauge. If that's a quicker, easier way of doing it, I'll do that. And then once I've ruled those out, then it's the pump. And generally speaking, especially in the fab world, you just yank the pump and put a new one in, and then deal with that pump later. Either you try to troubleshoot it with a little testing you have in your sub fab, or you just send it off for repair.
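Sean splits the system once, at the valve above the pump. If a system had several points where it could be capped, the same idea could be repeated as a bisection. The sketch below is that generalization, which is an assumption rather than something described in the episode. It also assumes exactly one fault, and the section names are hypothetical.

```python
from typing import Callable, Sequence


def find_faulty_section(sections: Sequence[str],
                        reading_ok: Callable[[int], bool]) -> str:
    """Locate a single fault by repeatedly splitting the system in half.

    sections: ordered from the pump upward.
    reading_ok(i): True if base pressure is good with the system capped
                   just above sections[i] (so sections[0..i] are included).
    """
    lo, hi = 0, len(sections) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if reading_ok(mid):
            lo = mid + 1   # everything up to mid is fine, fault is above
        else:
            hi = mid       # fault is at or below mid
    return sections[lo]


sections = ["pump", "short pipe and gauge", "line to chamber", "chamber"]

# Simulated fault in the chamber: any capped portion that excludes it reads good.
fault_index = 3
print(find_faulty_section(sections, lambda i: i < fault_index))  # chamber
```

Each capped reading costs tool downtime, so in practice the order of checks would also depend on how quick each one is, as Sean notes when choosing between moving the gauge and swapping it.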
Peter Kazarinoff 18:56
And so Sean from doing troubleshooting and working with vacuum systems a lot. One thing that technicians can come up against is just time, and the other is resources. How long should a technician sort of be thinking about that these vacuum troubleshooting steps are going to take, and what kind of resources do they need to be able to have at their disposal to be able to find solutions?
Sean 19:28
Well, you know, the timetable they have is dependent upon the factory, right? How quickly does that tool need to come up? It could be a hot tool. It could be a single point tool, right? It could be the only one in the fab. So knowing what the WIP is behind it and all that, it's usually going to be by talking to the production supervisors, and they can kind of tell you how urgent it is and how quick it needs to be. So what that comes down to for making your decisions in troubleshooting is, generally speaking, do I yank and replace the pump, and how soon? The first thing you have to figure out is, is it likely the pump? So again, we need to make sure that it's not a leak. We need to make sure that it's not a bad gauge, because you could spend that hour to yank and put in a new pump, and you'd be back at square one. So it's a wasted hour. But once we've determined that, we can then work with production and say, okay, I think there are some things I can possibly troubleshoot and fix on the pump. Do I have an hour or two to do that? If you do, then you could be saving a lot of time and resources, because once you yank it, that pump, especially if it's got process residue in it, is pretty much going to go to rebuild, because it could seize up. Literally, some pumps will seize up within five minutes of cooling off because of the process chemistry in there. You know, PECVD and metal etch are big ones. So you do need to understand that having a good spare pool is critical. I don't know of any fab that functions without a readily available spare pool for every model of pump they have. So having that ready is a very important resource, and it's up to production and you to understand: is it yank and replace and don't worry about the $70,000 rebuild cost, or spend a little time and see if you can fix it now, in situ? It's going to depend upon the specifics of the situation.
Peter Kazarinoff 21:50
So Sean, one other thing that I wanted to ask about before we finish up today is data collection and reporting. So when somebody is troubleshooting, what kind of things do they need to make sure that they're documenting and that they're able to pass down to a technician that comes on to the next shift, or to a supervisor or another technician working with them?
Sean 22:19
Well, pretty much everything in that troubleshooting process needs to be written down. For example, what the pressures were before. When was the last time we noticed it was good? When was the first time we noticed it was bad? What steps you've already gone through, so they're not recreating and redoing everything. And even, if you think about it, and sometimes we don't in the moment, documenting what your assumptions were. You know, did you assume that the MFCs were good? Or what you didn't check, or what you did check. And then, let's say you're stopped in the middle of the troubleshooting, and you've got three more things that you were planning on checking. Document that: hey, the next things I was planning on checking were B, C and D. So giving a thorough pass down of where you were, what you've collected, and what your next things to check are is definitely very valuable. The next guy could look at everything up, look at B, C and D. It's obviously why, but that's up to them. So you've done your part by sharing all your information to help them out.
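One way to make sure nothing on Sean's list is forgotten is to give the pass down a fixed shape. The record below is a hypothetical sketch: the field names and the example entries are invented, but each field corresponds to something he says should be written down.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PassDown:
    problem: str                      # quantified problem statement
    last_known_good: str              # when it was last seen working
    first_seen_bad: str               # when the problem was first noticed
    steps_done: List[str] = field(default_factory=list)
    assumptions: List[str] = field(default_factory=list)   # what was assumed, not verified
    next_checks: List[str] = field(default_factory=list)   # what you planned to check next

    def summary(self) -> str:
        lines = [
            f"Problem: {self.problem}",
            f"Last good: {self.last_known_good} / first bad: {self.first_seen_bad}",
            "Done: " + ", ".join(self.steps_done),
            "Assumed (not checked): " + ", ".join(self.assumptions),
            "Next: " + ", ".join(self.next_checks),
        ]
        return "\n".join(lines)


note = PassDown(
    problem="Base pressure 1e-1, expected 1e-3",
    last_known_good="before the PM",
    first_seen_bad="first pump down after the PM",
    steps_done=["isolated pump, pump reads good", "reseated KF flange"],
    assumptions=["MFCs are good"],
    next_checks=["B", "C", "D"],
)
print(note.summary())
```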
Peter Kazarinoff 23:53
So Sean, to finish up for today, I wanted to just ask you about based on your years of vacuum experience and all of your experience with many different vacuum components. What are some typical things that will cause problems in a vacuum system based on all your experience?
Sean 24:14
I'm not sure there really is a typical. So obviously joints, you know, especially ConFlat, but also KF and ISO, being put back together incorrectly. Or some foreign debris, or an old O-ring that was sealing before, but once you took the pressure off and tried to put the pressure back on, it's brittle. Those are definitely things. But there's such a wide variety of failure modes. In semiconductor, you're dealing a lot with process contamination of the pumps, so that's not an unusual failure mode to deal with. You will have leaks, especially up in the tools, because they're just more complex. Even just doing a normal PM means you're disconnecting a bunch of things and putting them back together. You've got potentially dozens of MFCs, right? You've got lots of things going on. So leaks are obviously a major issue. But there's also that relay that goes intermittently on you. I've had a situation where, when I had the system set up to operate the relay every six minutes, the relay would fail in three days. And when I changed it and the relay was operating every two minutes, it failed within eight hours. It turned out that it had a bit of a short, and the more cycling, the more heat generated, and then it would fail. So there are always those weird things you're going to find. There are things that happen more often, but there's never a typical.
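A quick bit of arithmetic on the relay story shows why the data pointed away from simple wear. The sketch below only uses the numbers Sean gives. Reading "in three days" and "within eight hours" as the full durations is an assumption, so the second figure is an upper bound.

```python
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24


def cycles(duration_minutes: int, period_minutes: int) -> int:
    """Number of relay operations in a given time at a fixed period."""
    return duration_minutes // period_minutes


# One operation every six minutes, failure after about three days.
slow_cycles = cycles(3 * HOURS_PER_DAY * MINUTES_PER_HOUR, 6)

# One operation every two minutes, failure within eight hours (upper bound).
fast_cycles_max = cycles(8 * MINUTES_PER_HOUR, 2)

print(slow_cycles)       # 720
print(fast_cycles_max)   # 240
```

If the relay were simply wearing out after a fixed number of operations, the two counts would be similar. Failing after fewer operations at the faster rate fits the heat buildup explanation Sean arrived at.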
Peter Kazarinoff 26:31
So Sean, thank you so much for talking with me about troubleshooting, and in particular, troubleshooting vacuum systems today.
Sean 26:38
Well, it's my pleasure. Thank you for having me.
Peter Kazarinoff 26:40
Please keep in touch.
Sean 26:43
Thank you.
Peter Kazarinoff 26:53
Talking Technicians is produced by MNT-EC, the Micro Nano Technology Education Center through financial support from the National Science Foundation's Advanced Technological Education grant program. Opinions expressed on this podcast do not necessarily represent those of the National Science Foundation. Join the conversation: If you are a working technician, or know someone who is, reach out to us at info@talkingtechnicians.org. We're always looking out for great guests to share more stories with you.