I'd like to resume the series of posts on the topic of FAB cycle time drivers. As mentioned in an earlier post, these are some of the key drivers for factory cycle time:
- overall factory size (number of equipment available to run a certain step)
- overall equipment uptime and uptime stability
- rework rate
- product mix
- number and lengths of queue time restricted steps in the process flow
- lot hold rate and hold times
- degree of automation of material transport
- degree of optimization in the dispatching / scheduling solution
I covered the factory size topic already, so here are a few thoughts on equipment uptime. I think everybody knows and agrees that equipment uptime is a key parameter. Equipment uptime has direct implications on factory capacity, and even in the simplest Excel-based capacity planning model you will find a column for planned equipment uptime. But of course the impact on capacity is only one aspect.
One could ask: capacity at what factory cycle time? Based on the operating curve discussed earlier, a FAB has different capacity at different cycle times. This is where equipment uptime comes into the picture. The ideal FAB achieves the planned uptime on each tool group in real life as well.
But what does it mean if a tool group achieves its planned uptime? We need to look a bit closer. Over what period of time does the tool group achieve, for example, 90%? Capacity planners typically assume an average uptime number, and somewhere in the fine print you can find whether this is meant per 1 week, 4 weeks, 13 weeks or another time frame. For the real FAB the time frames of interest are usually much shorter – if a few key tools are down right now for, let's say, 2 hours, that alone might create a lot of attention.
A deviation from the average planned uptime has the potential to impact the FAB's cycle time. Assuming the incoming WIP to a tool group is somewhat constant over time (which is already an optimistic assumption), a higher or lower average uptime will change the effective tool utilization, and that means the wait time of lots will be different:
If we zoom in a bit more, tool groups with the same average uptime might impact lot wait times differently, depending on what the day-to-day, shift-to-shift or even hour-to-hour uptime looks like.
Below are 3 “constructed” uptime day to day cases to illustrate that.
Tool group A has 10% downtime (red) and 90% uptime (yellow) every single day – no big surprise that the average uptime is 90%.
Tool group B has alternating days with 80% and 100% uptime, which results in the same 90% average uptime over the full time frame.
Tool group C has a very different downtime pattern, but the average is again 90%. To convince yourself: visually take all the red blocks beyond 10% per day and fill them into the 100% uptime days, and you get the picture from tool group A.
If your capacity planning team uses average uptime values for capacity planning, these 3 tool groups are treated exactly the same. For static capacity planning purposes this is fine, but if you also want to calculate/estimate/forecast the overall factory cycle time, these 3 tool groups will very likely impact the WIP flow differently, and therefore the lot cycle times will differ as well.
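One way to quantify the difference between tool groups A, B and C is to track the variability of uptime alongside its average. Here is a minimal sketch; the 20-day horizon and the exact day-by-day values for tool group C are my own assumptions, constructed to hit the same 90% average:

```python
import statistics

days = 20
tool_a = [0.90] * days               # 10% down every single day
tool_b = [0.80, 1.00] * (days // 2)  # alternating 80% / 100% days
tool_c = [1.00] * 16 + [0.50] * 4    # rare but long down events

for name, uptimes in [("A", tool_a), ("B", tool_b), ("C", tool_c)]:
    avg = statistics.mean(uptimes)
    spread = statistics.pstdev(uptimes)
    print(f"tool group {name}: average uptime {avg:.0%}, day-to-day stdev {spread:.2f}")
```

All three report the same 90% average, but the day-to-day standard deviation (0.00, 0.10, 0.20) immediately separates the stable pattern A from the volatile pattern C.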
Once this point of general understanding is reached the obvious next questions are:
1. Which uptime pattern is better for my factory: A, B or C?
(better as in: enables a more stable and lower cycle time)
2. How do I change the not-so-good ones to look more like the best one?
I will discuss this a bit more in the next post.
It has been a while, but summer in upstate NY is too beautiful not to enjoy as much as possible. The downside (at least for the blog) is that I spend less time on the computer writing new posts.
So instead of posting about Factory Physics and automation topics, I do more of this:
Later in fall there will be more time again for writing – for the time being just a very short post today.
Two very nice articles were written referring to earlier posts from my blog, and I think they are well worth the time to read:
Enjoy the summer !
Last week I had the honor to present a keynote at the 19th Innovation Forum for Automation in Dresden, Germany. After 2 years of virtual conferences, this year it was a full in person event again.
Here are the slides of my talk:
update: here are the video recordings of both days (these are 3-5h videos):
I have not yet discussed the results of the last poll. This post will focus on that.
Unfortunately, not many readers participated. The data is statistically on the weak side, but I think the outcome is in line with what I was expecting:
It seems that 70% of the voters use a rather simple method to define the maximum allowed tool group utilization. This matches what I have experienced in a lot of FABs.
Given the massive implications the FAB capacity profile has on FAB cycle time, it is surprising that in today's heavily data-driven world more advanced methods are not used. I wonder what the reason for that is. Here is some speculation from my end:
- not enough resources to manage the significant amount of data
- input data quality for capacity planning is limited due to grouping of products and averaged assumptions
- real factory performance data is highly dynamic and hard to forecast for the next 2…3 months
- planning scenarios change so frequently that a more detailed planning run takes more time than it takes for the next scenario request to roll in
- decision makers are used to simple rules like a flat 85%, since these have worked for the last 20 years to some extent; more advanced methods are "black magic", and capital-intensive decisions will not be based on "black magic"
- FAB cycle time is more of a high-level target; the operations/engineering departments need to figure out in the daily business how to get the cycle time down
- or maybe the FABs which use more advanced methods simply did not vote here
I would love to hear feedback on these topics.
Over the next weeks my posting activity will slow down a bit due to a lot of travel on my side. One special highlight coming up is my visit to Dresden, Germany to participate in the
where I will have the honor to give a talk. Check it out here (LINK) and maybe we can meet in Dresden in person. After the event I will post the slides here.
I recently attended the Advanced Semiconductor Manufacturing Conference (ASMC) in Saratoga Springs, NY and listened to a very interesting presentation.
Bill Wiseman from McKinsey & Company spoke about the future of the US semiconductor ecosystem and a few fundamental challenges which will have significant impact.
Bill is a long-term insider of the semiconductor business, and I got his permission to post his slide deck here:
Happy Easter everybody !
Today only a very short post. Since I normally discuss here how semiconductor FABs work in terms of cycle time and output, I thought: why not take a quick look inside a FAB?
This post was triggered by a recent YouTube video of an Intel FAB walk, which nicely shows what a modern FAB looks like. Please see below a few links to look inside a FAB:
Intel in Israel: LINK
Intel in US: LINK
Bosch in Germany: LINK
Micron in US: LINK
GlobalFoundries in Singapore: LINK
TSMC in Taiwan: LINK
Vishay (200mm) in Germany: LINK
Infineon (200mm) in Germany: LINK
Bosch (200mm) in Germany: LINK
I received an interesting comment on one of the older posts:
The topic is indeed very interesting. Most modern semiconductor processing equipment comes with 4 load ports. The main reason for that is to ensure the process chambers can be utilized as much as possible and have no idle time due to the exchange of lots. A simple 3-chamber tool with 4 load ports might look like this:
Individual wafers will be removed from the carrier on the load port and travel through the various equipment modules depending on the actual process recipe sequence. An example is shown below:
Let's assume lots with 5 wafers are processed on this tool. The wafers will sit in different chambers at different times. Below is a simplified example, which ignores the time a wafer spends in the transfer chamber for handling.
The reason why tools have more than 1 load port is illustrated in the picture below. For example: let’s assume load ports 2, 3 and 4 are down and only load port 1 can be used.
Lot 2 can only be loaded after lot 1 has finished and been unloaded from load port 1. This leads to idle process chambers:
Having more than 1 load port allows loading the next lot while the 1st lot is still processing, so these chamber idle times can be prevented: the 1st wafer of lot 2 is processed immediately after wafer 5 of lot 1:
To load lot 3 early enough – preventing chamber idle time between lot 2 and lot 3 – load port 1 is available again:
In the example above, 2 load ports would be more than enough to keep the process chambers busy all the time. So why do equipment vendors deliver most tools with 4 load ports?
How many load ports are really needed depends on a lot of factors. In my basic example above I ignored most of them. As always, the devil is in the details, but here are some factors which influence the need for more load ports:
- number of process chambers on the tool
- process times of the individual chambers
- time a wafer spends for transfer between chambers
- wafer flow logic through the tool (serial, parallel)
- load port down times
In my experience, most tools with lot processing times greater than 15-20 minutes can easily be fully utilized with 2 load ports, since there is enough time to transport the next lot to the tool while the current lot is still processing.
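This rule of thumb can be sketched as a back-of-the-envelope calculation. The model below is deliberately crude and my own simplification: it ignores chamber-level wafer flow and lumps unloading, transport and reloading of the next lot into a single replenish time:

```python
from math import ceil

def min_load_ports(lot_process_min, replenish_min):
    """Ports needed so the tool never starves: while one lot processes for
    lot_process_min, the other (n - 1) ports together provide
    (n - 1) * lot_process_min of slack to unload, transport and reload."""
    return ceil(replenish_min / lot_process_min) + 1

print(min_load_ports(20, 15))  # slow process: 2 load ports are enough
print(min_load_ports(4, 15))   # very fast process: even 4 ports fall short
```

In this toy model a 20-minute lot process time only needs 2 ports, while a 4-minute process time would already need 5 ports – consistent with the observation that some very fast tools cannot be fed fast enough even with 4 load ports.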
But let’s go back to the comment which initiated this post:
I completely agree that having all load ports always loaded will lead to higher wait times. Here is the theory behind that. To illustrate the effect I will use a simple FAB with 4 process steps, running on 4 different process tools. Each process tool has 4 load ports, the lot process time is 1h at each step, and lots have 25 wafers each:
In this scenario – with all 4 load ports always loaded – these would be the factory performance data:
Let’s look at a second scenario, which only has WIP on 2 of the 4 load ports:
Since there is now significantly less WIP in the factory, the overall factory cycle time is much shorter – at the same FAB output:
Here is my take on this: having multiple load ports on a processing tool is for sure very beneficial, since it enables the maximum possible equipment utilization. But having all load ports loaded with lots all the time is in most cases definitely not needed to achieve maximum factory output, and it is clearly the sign of a relatively slow factory. As a matter of fact, one can easily estimate the overall factory X factor by this logic:
Average number of lots waiting per tool equals the FAB X factor.
This ignores that lots might have different wafer counts and different processing times for different products at different steps, but if a FAB always has all 4 load ports loaded on all tools, and possibly 2 more lots waiting in stockers, this FAB will not run faster than an X factor of 6.
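The arithmetic behind this rule of thumb follows from Little's law (WIP = throughput x cycle time). Here is a sketch using the 4-step example factory from above; it assumes the tools never starve in either scenario, so the output stays identical:

```python
def fab_stats(n_steps, process_time_h, lots_per_tool):
    """Toy line from the example: each tool permanently holds lots_per_tool
    lots (one processing, the rest waiting on load ports / in stockers).
    By Little's law a lot then spends lots_per_tool process times per step."""
    step_ct = lots_per_tool * process_time_h
    total_ct = n_steps * step_ct
    rpt = n_steps * process_time_h
    x_factor = total_ct / rpt
    output_lots_per_h = 1 / process_time_h  # same in both cases: no starving
    return total_ct, x_factor, output_lots_per_h

print(fab_stats(4, 1, 4))  # all 4 load ports loaded: 16h CT, X factor 4
print(fab_stats(4, 1, 2))  # only 2 loaded: 8h CT, X factor 2, same output
```

The lots held per tool directly become the X factor, which is exactly the estimation logic stated above.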
Another interesting fact is that there are a few tools (mostly with very fast processing times) where even 4 load ports are not enough to always feed the tools with wafers fast enough.
A last statement on the topic of load port utilization: I have seen multiple cases where manufacturing departments use load port utilization as a metric, mainly with the interpretation that idle load ports are "bad".
I think this is driven by the general desire to have tools fully utilized and to have enough WIP at the tool for the next hours, so that even if upstream tools have a problem or lot transportation is slow, the tool group of interest can still process "full steam".
In one of the earlier blog posts (LINK) I received interesting feedback on what "acceptable" FAB cycle times are. The results showed big differences, and I think this is mainly based on the voters' professional experience. There are a lot of factors which influence a factory's capability to achieve a certain cycle time. If we assume 2 factories are running the exact same technologies and process flows, but have different actual factory cycle times, the difference will not come from the process times of the lots, but mainly from different wait times. Key drivers for wait times are:
- overall factory size (number of equipment available to run a certain step)
- overall equipment uptime and uptime stability
- rework rate
- product mix
- number and lengths of queue time restricted steps in the process flow
- lot hold rate and hold times
- degree of automation of material transport
- degree of optimization in the dispatching / scheduling solution
Let’s dig into a few of them in more detail.
One of the biggest drivers for factory cycle time is the size of the factory itself. The key reason for this is that processing equipment does not have 100% uptime. Put simply: if a factory had only 1 tool to process a certain step and that tool is down, there is no path for the lots and they have to wait until the tool is back up. If there is more than 1 tool available, lots have a path to progress and there will be less waiting time. This effect can be seen very well with the help of operating curves:
Having more than one tool available to run lots will massively reduce the average lot wait time, if all other parameters of the tools are the same. Everyone in manufacturing knows this effect and for that very reason avoids having these “one of a kind” situations.
It can also be seen that the effect of going from 2 to 3 tools is smaller than that of going from 1 to 2 tools. I think a golden rule in capacity planning for semiconductor FABs is: "... avoid one-of-a-kind tools as much as possible – or plan with very low tool utilization for these situations ..."
For example, if there is no way around a one-of-a-kind tool and you still need to achieve cycle times around an X factor of 3, then – in the given setting – the maximum allowed tool utilization would be 44%!
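For readers who want to experiment with this effect: classic queueing formulas reproduce the qualitative behavior. The sketch below uses an M/M/c model (Erlang C), which assumes random arrivals and exponential process times; a real tool group's curve typically sits higher due to downs, batching and setups, so the numbers are illustrative and will not match the 44% figure from the specific setting above:

```python
from math import factorial

def x_factor_mmc(n_tools, utilization):
    """X factor (cycle time / process time) of an M/M/c tool group."""
    a = n_tools * utilization  # offered load, in "tools worth" of work
    summation = sum(a**k / factorial(k) for k in range(n_tools))
    last = a**n_tools / factorial(n_tools)
    p_wait = last / ((1 - utilization) * summation + last)  # Erlang C
    wait_in_pt = p_wait / (n_tools * (1 - utilization))     # mean wait / PT
    return 1 + wait_in_pt

for c in (1, 2, 3):
    print(f"{c} tool(s) at 85% utilization: X factor = {x_factor_mmc(c, 0.85):.2f}")
```

At 85% utilization this model gives X factors of roughly 6.7, 3.6 and 2.6 for 1, 2 and 3 tools – the one-of-a-kind tool is by far the worst, and the step from 1 to 2 tools helps much more than the step from 2 to 3.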
The really interesting thing here is that each tool set's operating curve is of course shaped differently, and in order to understand the impact on the total factory cycle time, one needs to know and understand the operating curves of all tool sets in the factory. A second aspect to keep in mind: how many times will a lot come back to a tool set? This has a big impact on the factory cycle time as well.
Example: The operating curve shows an X factor of 3 and the processing time at the tool group is 1 hour, which means there will be 2 hours of average wait time per lot visit. Here is the impact on the overall factory cycle time based on the number of passes (the number of times this tool set appears in the flow):
Look at the last column – the impact can be massive !
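The logic behind such a table is simple enough to reproduce in a few lines (2 hours of wait per visit, as in the example above):

```python
wait_per_pass_h = 2  # X factor 3 at a 1h process step -> 2h wait per visit

for passes in (1, 5, 10, 20):
    total_wait_h = passes * wait_per_pass_h
    print(f"{passes:2d} passes -> {total_wait_h}h (~{total_wait_h / 24:.1f} days) "
          "of waiting at this one tool group alone")
```

With 20 passes, a single tool group already contributes 40 hours (almost 2 days) of pure wait time to every lot's cycle time.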
Having all these effects in mind, I think it is easy to feel "relatively safe" if a FAB has at least 3 or 4 tools available for each process step. In my opinion that is a much better situation than having 1 or 2 tools available, but the "name plate" number of tools available in a capacity planning model is one thing.
How many tools or chambers are really available (and not inhibited, temporarily disqualified or unused for other reasons) is often a different picture. Also, based on my experience, the actual number of tools available on the floor is seldom bigger than in the capacity (and cycle time) planning model.
improvement potential: Frequently check the real available number of tools vs. your capacity plan !!!
Back to the statement "... having 3 or 4 tools available is relatively safe ...": the positive effect of having more tools of course continues at higher tool counts. As in the picture below, a 4-tool tool set runs nicely at an X factor of 3, but look what happens to cycle time if we had significantly more tools:
This is the real reason why the big players in the semiconductor wafer FAB business build MEGA FABs. Having a very large number of tools in parallel allows running at very fast FAB speeds while still utilizing the expensive tools much more highly. Now take into account that tool pricing will also be lower if a FAB orders 10 tools instead of 2 or 3, and the whole thing becomes really desirable.
Of course, building a very large FAB requires a lot of upfront capital, and the expected demand needs to be big enough to fill a bigger FAB, but if you have the choice and are in doubt: always GO BIG, it will pay benefits for many years to come.
To close this post, I'm curious how the industry is dealing with the effects described above, specifically for planning purposes. In your capacity planning model, how do you define the maximum allowed utilization for a tool group? I'm assuming that 100% planned utilization is not a legitimate assumption, since it would lead to extremely high FAB cycle times. How is this modeled in your experience?
Thank you to everyone who participated in the last poll. Participation was significantly down despite the fact that there were plenty of post viewers. My interpretation is that readers are not too sure about the actual value of 1 day of cycle time. This observation is also in line with my personal experience from working in semiconductor wafer FABs. It seems that everybody acknowledges that fast cycle time is a good thing and that it would be valuable to work on it, but there is no clear understanding of what the actual value is. The results of the poll look accordingly:
the same data sorted by the $ value:
The majority of voters pointed towards a few hundred thousand dollars, but 33% said it is a million dollars or more!
I think one of the reasons why the real value of cycle time is not clearly defined is the lack of an accepted, standardized model for how to calculate, or at least estimate, it. I have seen a few different approaches, from very simple to very complex, and what is worse: different models generate different results, which does not really help to build confidence in the numbers.
One very simple model is the following:
If we look at our factory operating curve and assume we are running at our voters favorite operating point:
800 wafer starts per day at a X factor of 3 (or 60 days cycle time)
we can extrapolate the value of 1 day of cycle time by the following logic:
- 800 wafer starts per day = 800 x 365 = 292,000 wafers per year
- 292,000 wafers per year times $1,150 selling price = $335.8 million revenue
If we now use the factory's operating curve and "look" to the left and right of the current operating point, we can do a very simple estimation of the value of 1 day of cycle time:
Since the operating curve is non-linear, there is a difference whether we look towards lower or higher utilization – but if we assume only small changes around the current point, we can ignore this.
Towards higher utilization:
plus 50 wafer starts will lead to 20 days more cycle time, or as a simple ratio:
2.5 wafers per 1 day of cycle time.
2.5 wafers x 365 days x $1,150 = ~ $1 million revenue
Towards lower utilization:
minus 100 wafer starts will lead to 20 days less cycle time, or as a simple ratio:
5 wafers per 1 day of cycle time.
5 wafers x 365 days x $1,150 = ~ $2 million revenue
This is a big difference between the 2 numbers, but even if we use the smaller one to be on the safe side, $1 million is a serious number. Keep in mind that all the other benefits of faster cycle time are ignored in this simple model.
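For reference, the whole simple model fits in a few lines of Python (all numbers are the assumptions from above: 800 starts per day, $1,150 selling price, and ±20 days of cycle time read off the operating curve):

```python
selling_price = 1150  # $ per wafer (assumption from the example)

annual_revenue = 800 * 365 * selling_price  # revenue at the operating point

def value_per_ct_day(delta_starts_per_day, delta_ct_days):
    """Annual revenue attached to 1 day of cycle time, from the local
    slope of the operating curve around the current operating point."""
    wafers_per_ct_day = delta_starts_per_day / delta_ct_days
    return wafers_per_ct_day * 365 * selling_price

print(annual_revenue)             # $335.8 million per year
print(value_per_ct_day(50, 20))   # towards higher utilization: ~$1.05M
print(value_per_ct_day(100, 20))  # towards lower utilization: ~$2.10M
```

The two slopes bracket the value of 1 day of cycle time between roughly $1M and $2M per year, matching the hand calculation above.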
Another model – significantly more complex – takes into account:
- revenue gain due to faster cycle time versus falling selling price
- revenue gain due to faster yield learning
It was developed by professor Robert Leachman. He teaches this method at the University of California, Berkeley. The complete coursework can be found here: LINK
I will not dig deeper into the "fascinating world" of models for calculating the value of cycle time; instead I will discuss the practical application of the value of speed a bit more.
Clearly, the value of speed also depends on the overall market situation. In a very high demand situation customers might be willing to tolerate higher cycle times if they can just get enough supply. Factories tend to start more wafers in these conditions and simply "cash in".
Still, if the engineering team could implement measures to reduce the factory cycle time by, let's say, 1 day, management could "use" this gained 1 day of FAB capability to start a few more wafers, driving the FAB back to the previous speed but delivering more wafers = more revenue.
In this scenario the question is:
1 day of cycle time is worth $1 million. How much will the engineering team be allowed to spend to enable this 1 day of cycle time reduction?
This comes down to how ROI is handled in the company, but there is a path to calculate this. It enables model-based dollar spending on cycle time, which supports the decision of whether a measure or change is worth implementing or not.
In my next post I will start discussing a few more details around the operating curve and, most importantly, what can be done to improve it.
Today only a very short post !
Very interesting poll results! Of course the poll left a lot of things open to free interpretation and assumptions, but as expected, voters had different opinions. Here is the feedback chart:
The same data in the context of the operating curve:
If I ignore the outliers at the 250 and 500 wafer starts per day marks, the largest group of voters would trade
1 X factor of FAB speed for 100 additional wafer starts per day
and start 800 wafers per day instead of 700. Let's try to convert this into more understandable numbers:
1 X factor = 1 raw process time; for a "typical FAB" this could mean anything between 10 and 25 days of FAB cycle time. On the other hand, what do +100 wafer starts per day mean financially?
Let's assume an average profit of $150 per wafer: an additional 100 wafer starts per day adds up to 100 x 365 = 36,500 more wafers per year, or $5,475,000 more profit per year. This seems like an absolute no-brainer.
How about another $5.5 million of profit if we go to 900 wafer starts per day, for $11 million of additional profit in total? Now the cycle time penalty looks very different. We would pay with 4 X factors, or 40-80 days more FAB cycle time. Still a no-brainer?
Here is a table of how this would look (assumption: FAB raw process time = 20 days):
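The arithmetic behind such a table can be sketched as follows (the 700-starts baseline at an X factor of 3, the poll's "+100 starts per +1 X factor" trade, the 20-day raw process time and the $150 profit per wafer are all assumptions stated above):

```python
rpt_days = 20           # assumed FAB raw process time
profit_per_wafer = 150  # assumed average profit per wafer

def scenario(starts_per_day, x_factor):
    """Cycle time and annual profit for one operating point."""
    cycle_time_days = x_factor * rpt_days
    annual_profit = starts_per_day * 365 * profit_per_wafer
    return cycle_time_days, annual_profit

base = scenario(700, 3)   # (60 days, $38.3M per year)
trade = scenario(800, 4)  # +100 starts bought with +1 X factor
extra_profit = trade[1] - base[1]
print(f"${extra_profit:,} more profit for {trade[0] - base[0]} more days of cycle time")
```

Going from 700 to 800 starts per day yields the $5,475,000 from above, paid for with 20 additional days of cycle time; the curve steepens further out, so the next 100 starts cost disproportionately more.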
I think it comes down to the famous question:
What is the value of FAB cycle time – the value in $$ ?
Based on the table above it seems like shorter FAB cycle times are not really desirable, but at some point customers will turn away and order wafers from someone else – if the lead time between order placement and actual wafer delivery is too long ...
Experienced wafer FAB practitioners know that shorter overall factory cycle times can have a lot of positive effects:
- faster learning cycles to improve yield
- lower overall FAB WIP – lower overall inventory cost
- lower overall FAB WIP – lower risk of excursion impact
- faster detection of possible process issues
- faster reaction capability on demand / product mix changes
- likely better on-time delivery for low volume products
The question really is: up to which point is it more beneficial to run the FAB faster (lower cycle time), i.e. where do the benefits of being fast outweigh the benefit of higher profit from higher FAB output?
I cannot resist putting up another poll here to see what the readers think the value of 1 day of shorter (or longer) FAB cycle time is. For our example factory above, running at these parameters:
- 800 wafer starts per day
- X factor of 3 or 60 days total FAB cycle time
- about 47,500 wafers of total FAB WIP
- 98% manufacturing yield (wafer scrap based yield)
- 90% die yield (electrical yield)
- cost per wafer of $1,000
- selling price per wafer of $1,150
- selling price more or less stable (chip shortage driven)
- high product mix in the FAB (more than 250 different products, running on more than 100 routes)
I can't wait to see the results. To give a few more readers a chance to vote, I will keep this poll open for about 3 weeks, so the next post will come some time at the end of February.
Reflecting on the wide spread of acceptable wait times, and therefore acceptable FAB cycle times, from the poll results, I was wondering: why do people have such different opinions? I think it has to do with the actual factory conditions the individual voters have experienced in their professional careers.
Having fast cycle times is an obvious goal; the question is just how fast is "good" or possible? The expectation must be influenced by real-world experience, otherwise everyone would have voted for the "less than 30 minutes" bucket.
This leads to the question: Why do different FABs have different cycle times or different X factors ?
Absolute FAB cycle times of course also depend on the raw processing times (RPT) of the products in the factory. For example:
            RPT       wait time   cycle time   X factor
Factory 1   10 days   20 days     30 days      3
Factory 2   20 days   40 days     60 days      3
If we just looked at the absolute cycle times, it would seem that factory 2 is much slower, but in terms of wait time relative to processing time (aka X factor), both factories perform similarly.
To really be able to "judge" or compare FAB speeds, cycle times or X factors need to be normalized to the overall factory loading or factory utilization. To explain why this is important I will use the picture of a 3-lane highway.
Imagine you use this highway for your daily commute to work, and let's assume these basic data:
- distance from your home to work = 30 miles
- speed limit on the highway = 60 miles per hour
- you are not driving faster than the speed limit
- your “raw driving time” = “raw processing time” = 30 minutes
- the highway (the factory) is the same every day: it has 3 lanes and a speed limit of 60 mph
Let’s try to answer this question: How long does it take to get to work?
I think everybody will agree that there will be very different driving times (cycle times) on different days and at different times – all on the exact same highway (factory). The difference is the utilization of the highway. Now let's assume the same highway, but we throw in a lane closure, which actually means the highway now has reduced capacity:
The table below shows some assumed drive times (think cycle times):
The point of this example is that the drive time on the highway depends on how much the highway is utilized. Also important: the highway capacity has an impact on the highway utilization, and therefore on the drive time as well.
If we plot the data points in a chart it will look like this:
If we translate this picture into a semiconductor wafer FAB, there are a few interesting points to note:
- the FAB itself has a certain capacity
- the capacity of the FAB will not be stable if things like the number of tools, tool uptime or product mix change
- the utilization of the FAB is the result of a decision: how many wafers to start
- the very same factory can have completely different cycle times depending on the FAB utilization
I personally think this behavior, famously known as the operating curve, is one of the biggest challenges in the semiconductor manufacturing world (assuming that process stability and yields are under control).
Each FAB has such a curve, describing the factory's ability in terms of what average cycle time can be achieved at which utilization level. Very important: the operating curves of different factories very likely differ in shape.
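A common closed-form approximation for such a curve uses a single parameter to bundle all the variability in the factory (tool downs, product mix, dispatching quality, ...). This is a sketch, not any specific FAB's curve, and the alpha values are purely illustrative:

```python
def operating_curve_x(utilization, alpha):
    """Simplified operating curve: X = 1 + alpha * u / (1 - u).
    A lower alpha means a 'better' factory: less variability, flatter curve."""
    return 1 + alpha * utilization / (1 - utilization)

for alpha in (0.5, 1.0, 2.0):  # three hypothetical factories
    xs = [round(operating_curve_x(u, alpha), 2) for u in (0.5, 0.7, 0.85, 0.9)]
    print(f"alpha={alpha}: X factors at 50/70/85/90% utilization = {xs}")
```

The same utilization level yields very different X factors depending on alpha, which is exactly why cycle times can only be compared between factories after normalizing for utilization and curve shape.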
The factory operations management team has “only” 3 tasks here:
- know what the FAB's operating curve looks like (aka: what cycle time can be expected at which FAB loading or utilization level)
- decide how many wafers to start in order to achieve the desired cycle time and FAB output level
- execute daily operations and constantly improve the factory's operating curve
To close today's post, I'd like to ask again for your input. If you were the FAB manager of the factory below, what would your wafer starts decision be, assuming you have enough orders to start even 1000 or more wafers per day?
Results will be discussed in the next post.
I'd like to be open: I could not resist using the trendy "chip shortage" term to generate some interest. Everything I will discuss in this post series is of course fully applicable even in times without a chip shortage.
Let’s start with the results of my last poll:
The spread of the answers is bigger than I expected, but it makes sense to some extent. Let's chart the same data in a different way, sorted by the wait time buckets:
What this means: for the same assumed "fully loaded FAB", wait times anywhere from below 30 minutes to more than 4 hours are seen as acceptable. Let this sink in ...
How does this impact FAB performance? It results in significantly different total factory cycle times.
In order to illustrate that, let me put a few assumptions down to estimate what these wait times really mean:
- about 80% of all steps of a product flow typically fall into the category "processing time 30-60 min."
- the remaining 20% of the steps are shorter or longer; let's assume they average out to 30-60 min. as well
- for the estimation I set the 30 – 60 minutes range to a fixed 45 minute processing time
The cycle time of a single step in the product flow is always calculated as (ignoring any lot hold times):
Based on that, we can easily calculate the cycle time of a step given different wait times. For the wait times from my poll it looks like this:
Another very common indicator to measure and compare cycle time is “X factor”.
Here is the definition of “X factor”:
The same cycle time table from above, now including the X factor:
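Both definitions fit in a couple of lines (45 minutes of fixed per-step process time, as assumed above):

```python
process_time_min = 45  # fixed per-step process time (assumption from above)

def step_cycle_time_min(wait_min):
    """Step cycle time = processing time + wait time."""
    return process_time_min + wait_min

def step_x_factor(wait_min):
    """X factor = cycle time / raw process time."""
    return step_cycle_time_min(wait_min) / process_time_min

for wait in (30, 60, 120, 240):  # the poll's wait time buckets, in minutes
    print(wait, step_cycle_time_min(wait), round(step_x_factor(wait), 2))
```

A 90-minute wait at a 45-minute step, for example, means a 135-minute step cycle time and an X factor of 3.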
The true implication of the differences in acceptable wait time comes to light if we scale this up to full factory level. For illustration purposes, let's assume the following FAB parameters:
- typical products have 40 mask layers
- average of 15 steps per mask layer or 40 x 15 = 600 steps in the flow
- basic assumption of 45 minutes average process time per step (as discussed above)
With these input parameters the total acceptable cycle time of this FAB would look like this:
Different factories with different "acceptable wait time" assumptions would have cycle times differing by multiple months for the same type of product.
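The full-factory scaling can be reproduced directly from the stated assumptions (600 steps, 45 minutes average process time per step):

```python
n_steps = 600          # 40 mask layers x 15 steps per layer
process_time_min = 45  # average process time per step (assumption from above)

def fab_cycle_time_days(wait_min_per_step):
    """Total FAB cycle time if every step carries the same average wait."""
    total_min = n_steps * (process_time_min + wait_min_per_step)
    return total_min / (60 * 24)

print(fab_cycle_time_days(0))  # raw process time only: 18.75 days
for wait in (30, 60, 120, 240):
    print(f"{wait} min wait per step -> {fab_cycle_time_days(wait):.1f} days")
```

The poll's wait time buckets translate into roughly 31 to 119 days of total cycle time, i.e. the multiple-months spread mentioned above.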
I'm very sure that FAB management with an actual 80-day cycle time would really love to get down to 50 or 40 days, let alone 30 days. The magic question is: how?
In my next post I will start looking into that.
Happy New Year !
This will be the last part of the bottleneck discussion. As mentioned in part 3, I think the most objective and telling indicator of the true factory bottleneck is:
highest average lot wait time at a tool group
Wait time, or cycle time in general, is one of the very few indicators which cannot easily be manipulated or "adjusted" by using different methods of calculation or aggregation. Time never stops, and measuring the time between a lot logically arriving at a step and it starting to process at the step (on an equipment) requires just 2 simple time stamps, which are typically recorded in the MES of the factory. For example:
lot arrived at the step: 01/02/21 4am
lot started processing: 01/02/21 10am
The wait time of the lot is then super simple: 6 hours.
The beauty of this metric is that no other information is needed, just these 2 time stamps. It covers any possible reason why the lot waited 6 hours, no matter what:
- equipment was not available due to down time
- equipment was not available since it was busy running another lot
- lot was not started due to missing recipe
- lot was not started due to no operator available
- lot was not started since operator chose to run another lot
- lot was not started due to too much WIP in time link zone
- lot was not started due to schedule had it planned starting at 10am
- lot was not started due to … “name your reason here”
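The computation from the two MES time stamps can be sketched like this (the time stamp format string is my assumption, chosen to match the example above):

```python
from datetime import datetime

def wait_time_hours(arrived: str, started: str,
                    fmt: str = "%m/%d/%y %I%p") -> float:
    """Wait time = time between logical arrival at the step
    and process start on the equipment."""
    t0 = datetime.strptime(arrived, fmt)
    t1 = datetime.strptime(started, fmt)
    return (t1 - t0).total_seconds() / 3600

print(wait_time_hours("01/02/21 4am", "01/02/21 10am"))  # -> 6.0
```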
One key part of the FAB Performance metrics – as discussed in part 2 – is:
- deliver enough wafers in time –> customer point of view –> cycle time of the FAB
In other words, once the decision was made to start a lot into the factory, it has some kind of target date/time by when this lot needs to be finished or shipped. Any wait time is by nature now a “not desired state” – especially if the wait time is “very long”. That means tool groups which generate the highest average lot wait time will very likely be the biggest problem or bottleneck.
Let’s have a look at some example data to illustrate that:
The chart above shows the average lot wait times per step of our complete factory. Some steps have 1 h wait time, others have up to 6 h.
Since this chart shows the data by step in the order of the route or flow, it does not immediately tell which tool groups the various steps are running on.
The same data – including the tool group context – will tell this better:
If we now aggregate and sort this by tool group instead of step we have our bottleneck chart:
From this chart tool group 7 clearly has the greatest average lot wait time of all tool groups. An interesting version of this chart is the “total wait time contribution” chart which shows the sum of the individual step wait times.
For example tool group 7 has 3 steps in the route and on average a lot waits on each step 6h. If we plot the same data as “total wait time contribution chart” we will not average the wait time of the individual step but add them: Tool group 7 will show 6h + 6h + 6h = 18h of total wait time for each lot.
Note that the sort order of the tool groups is now different. For example tool group 1, which on average has the lowest wait time (1 h), is now ranked as number 4. From an overall “is this tool group a problem for the factory ?” point of view I say no – since lots barely wait there – it just happens that tool group 1 has a lot of steps in the flow. I strongly lean towards the average chart for the overall definition of the FAB bottleneck, but recommend always having a look at the cumulative chart as well.
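The difference between the two aggregations can be sketched with dummy data (step list and values are my illustration, echoing the tool group 7 example above):

```python
from collections import defaultdict

# (step, tool_group, average_lot_wait_hours) - dummy data
steps = [(1, "TG1", 1.0), (2, "TG7", 6.0), (3, "TG1", 1.0),
         (4, "TG7", 6.0), (5, "TG1", 1.0), (6, "TG7", 6.0)]

per_tg = defaultdict(list)
for _step, tg, wait in steps:
    per_tg[tg].append(wait)

# "bottleneck chart": average lot wait time per tool group
avg_chart = {tg: sum(w) / len(w) for tg, w in per_tg.items()}
# "total wait time contribution chart": sum over all steps of the tool group
total_chart = {tg: sum(w) for tg, w in per_tg.items()}

print(avg_chart)    # TG7 averages 6.0 h, TG1 averages 1.0 h
print(total_chart)  # TG7 contributes 18.0 h per lot, TG1 contributes 3.0 h
```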
In part 2 of the Bottleneck blog series I discussed the “Factory Utilization Profile chart”. I think this chart, enhanced with the wait time data from above, will give the “complete view” of what is going on in the factory and will spark enough questions to dig deeper at the right tool groups.
The chart below shows the data sorted by the highest average cycle time:
The obvious question is: Why is there so much wait time on tool group 7 at such low utilization ? Or asked differently: half of the time the tool group is idle – why do lots wait on average for 6 hours ?
Or another one: How is tool group 1 able to achieve such low wait time ?
At this point I like to stop for a second and point you to an excellent source of additional discussion on the topic of bottlenecks and cycle time:
If you subscribe to the newsletter, you will have access to past editions as well !
Let me get back to the statement: Any wait time is by nature now a “not desired state” especially if the wait time is “very long”
Given the nature of the wafer FAB the ideal case of zero wait time at all steps is not very realistic since there are too many sources of variability in a factory. Therefore experienced capacity planners and production control engineers typically set an expected wait time target per step (and therefore by tool group). Using these expected wait times, the definition of “very long” becomes easier.
For example, if
- tool group A has an expected wait time of 2 hours
- tool group B has an expected wait time of 5 hours
An actual achieved wait time of 6 hours would be kind of tolerable on tool group B but clearly seen as very high on tool group A.
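Judging an actual wait time against the expected target can be sketched as a simple ratio (tool group names and targets taken from the example above; the ratio idea is my illustration):

```python
# expected wait time targets per tool group, in hours (from capacity planning)
expected_wait = {"A": 2.0, "B": 5.0}

def wait_ratio(tool_group: str, actual_h: float) -> float:
    """Actual wait time relative to the expected target;
    values well above 1 flag a wait time that is 'very long'."""
    return actual_h / expected_wait[tool_group]

print(wait_ratio("A", 6.0))  # -> 3.0 : clearly very high
print(wait_ratio("B", 6.0))  # -> 1.2 : roughly tolerable
```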
Setting expected wait times per step and/or tool group depends on a lot of parameters, like:
- planned tool group utilization
- number of tools in the tool group
- duration of process time
- batch tool / batching time
- lot arrival time variability
- many others
I’m curious what the readers of this blog think would be an acceptable average wait time for non-bottleneck steps in a fully loaded factory.
Let’s assume that most steps in the factory have processing times of 30 – 60 minutes, running on non-batch tools, and the factory is fully loaded – i.e. the capacity planners tell you that you cannot start more wafers. What would be an acceptable average lot wait time for these steps, in your opinion ?
Please vote below, what you would see as good / o.k. / acceptable:
I will share and review the results in my next post.
Merry Christmas and Happy Holidays !
I hope everybody is having a good time with friends and family and, after a lot of good food, is ready to sit down and discuss more details about factory bottlenecks. In today’s post I will start zooming in on the 3 not-grayed-out metrics from the poll results picture below:
To disclose my personal opinion upfront: I think that “highest average lot wait time” (or metrics derived from it) is the most objective way to measure and define the true factory bottleneck. But let’s discuss all 3 of the metrics a bit.
highest miss of daily moves vs. target
I think every factory in the world is measuring and reporting in some way the number of “Moves” – the number of wafers which were processed/completed on a step in a day, a shift or an hour – for the whole FAB or for departments, down to individual process flow steps, and grouped by equipment or equipment groups.
“Moves” is a very attractive and popular metric for a lot of reasons:
- Moves can be easily measured and aggregated in all kind of reporting dimensions
- based on the number of steps in a process flow (route) it is clear how many Moves a wafer needs to complete before it is ready to be shipped
- Moves is a somewhat intuitive metric – humans like to count
- target setting seems to be pretty straightforward – “more is better”
I personally think measuring a FAB via “Moves” as the universal speedometer can be very misleading and might drive behaviors which are actually counterproductive for the overall FAB performance. At the very least, a well-thought-through and dynamic target setting is needed to steer a factory which is mainly measured by the number of Moves. The danger of Moves as the key metric might be smaller in fully automated factories, since the actual decision making is done by algorithms which usually incorporate a lot of other metrics and targets – Moves are then more an outcome of the applied logic than an overarching input and driver.
In manually operated factories, where operators and technicians decide which lot to run next and on what equipment, a purely Moves-driven mindset can do more harm than good to the overall FAB performance.
I think a lot has been written and published on this topic and there are strong and different schools of thought out there, but I’m fully on board with James P. Ignizio’s view in his book
In chapter 8 of his book – titled
“Factory Performance Metrics: The Good, The Bad, and The Ugly”
“Moves” get a nice talk – in the “Bad and Ugly” department – for the very reason that Moves can drive counterproductive behavior. If you are interested in this topic, I strongly recommend reading the book.
Before I jump to the next metric, I just want to say that I think Moves are important to understand and are a useful indicator if used within the right context – but not “blindly” as the most important indicator which drives all decision making.
highest amount of WIP behind a tool group
Almost one third of the voters picked this metric. Similar to Moves there are a lot of advantages to measure WIP:
- WIP can be easily measured and aggregated in all kind of reporting dimensions
- using “Little’s law” it is easy to define WIP targets
- WIP is a very intuitive metric, especially in manual factories – is my WIP shelf full or empty ?
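The Little’s law point above can be sketched in a few lines (the throughput and cycle time numbers are assumed example values):

```python
def wip_target(throughput_lots_per_h: float, target_ct_h: float) -> float:
    """Little's law: WIP = throughput x cycle time.
    Given a planned throughput and a target cycle time for a step or
    tool group, this is the WIP level one should expect to see there."""
    return throughput_lots_per_h * target_ct_h

# a tool group completing 2 lots/h with a target step cycle time of 1.5 h:
print(wip_target(2.0, 1.5))  # -> 3.0 lots
```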
In general – for daily operations – having a lot of WIP is seen as problematic, since it might lead to lots not moving, starvation of downstream steps and tools, and long lot wait times before processing. So high WIP is not a desirable status, and very high WIP must surely be a problem. I think here as well – it depends. For example, it depends on what the target WIP for the given context (like a tool group) is. Just trying to lower the WIP as much as possible (“at all cost”) might lead to WIP waves in the factory and to underutilization and lost capacity.
Why do I not 100% subscribe to “highest WIP = the bottleneck” ? It is simply that the tool group with the highest WIP does not necessarily have the worst impact on the FAB performance. Here are some data points for this:
Let’s assume we have a very small factory running a very short route – with only 30 steps. If we plot a chart showing the WIP (in lots) per step for each step and sort the steps in the order of the process flow – meaning lot start on the very left and lot ship on the very right – we get what is typically called a line profile chart.
In the picture below our factory is perfectly balanced (if we define balanced as lots per step – another great topic to talk about), because on each step there are currently 3 lots waiting – or processing.
If we look a bit closer, different steps are of course processed on different tool groups. If we add this detail, the same factory profile looks like this:
For example tool group 2 has 2 steps in the flow and tool group 9 has 3 steps. Our bottleneck metric is the aggregation of the WIP by tool group (“highest WIP behind a tool group”). To find out which tool group this is, we simply aggregate the same data from the line profile by tool group instead of per step:
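The aggregation itself is a one-liner worth sketching (the step-to-tool-group mapping is dummy data in the spirit of the line profile above):

```python
from collections import Counter

# step -> tool group mapping (dummy data: TG2 has 2 steps, TG9 has 3)
tool_group_of = {1: "TG2", 2: "TG2", 3: "TG9", 4: "TG9", 5: "TG9", 6: "TG1"}
# the "perfectly balanced" line profile: 3 lots on every step
wip_per_step = {step: 3 for step in tool_group_of}

# aggregate the line profile by tool group instead of per step
wip_per_tg = Counter()
for step, lots in wip_per_step.items():
    wip_per_tg[tool_group_of[step]] += lots

print(wip_per_tg.most_common())  # TG9 leads with 9 lots (3 steps x 3 lots)
```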
Tool group number 1 has the highest WIP of all tool groups in this FAB – so it clearly must be the number 1 bottleneck ? I do not think so. As discussed earlier, more context is needed. For example, if tool group 1 is a scrubber process – which typically appears in the flow many times and is an uncomplicated, very fast process – having the overall highest number of lots there is not necessarily the biggest problem of the factory. Yes, one can argue it would still be nice to have less WIP sitting at a scrubber tool set, but this is already part of the missing context I mentioned earlier.
Measuring and reporting WIP is an absolute must in a semiconductor factory, but interpreting WIP levels and assigning them attributes like “high”, “normal” or “low” needs a very good reference or target value. Setting WIP targets should be done via math and science, reflecting the overall desired factory WIP distribution – in order to achieve the best possible FAB performance.
Before I close this topic for today, let me say: my simple “perfectly balanced” line from the pictures above might not be balanced at all if we incorporate additional factors:
- different steps / different tool groups have very likely different capacities
- different raw processing times
- might be batch or single wafer tools
- might sit inside a nested time link (queue time) chain
At this point I will pause and hope that I could stimulate some thinking and of course would love to hear feedback from the readers out there. The next post will be fully dedicated to the last open metric …
A big thank you to everyone who voted in my little poll, here are the results:
I kind of expected a picture like this – but what does this mean ? Here is my interpretation:
Bottlenecks are widely known as the one thing one should work on first to improve the overall FAB performance. But it seems we have different opinions on how to measure – and therefore how to define – what the bottleneck is.
For a real existing FAB that means: if different people or groups use a different definition, they would very likely identify different tool groups as the bottleneck – for the very same factory ! Of course we did not yet discuss what type of bottleneck we are talking about: a short-term current one, a long-term planned bottleneck or any other definition. Nevertheless, people would very likely identify different tool groups as the key FAB problem …
Before we discuss this a bit more, I think we need to clarify the meaning of “bottleneck for the FAB”. In my opinion the purpose of a FAB is to make money, and in order to do this wafers need to be delivered to customers in a way that the overall cost is lower than the selling price. Selling price also means one needs someone to sell them to – the CUSTOMER. For the purpose of this bottleneck discussion I exclude topics like yield and quality, assuming these are “o.k. and in control”. I will just focus on the 2 other key metrics for “FAB performance”:
- deliver enough wafers in time –> customer point of view –> cycle time of the FAB
- manufacture enough wafers –> total cost / manufactured wafers –> cost per wafer –> FAB output
So in my opinion, a bottleneck is a tool or tool group which negatively impacts the cycle time of the FAB and therefore the FAB output in general – but more specifically the output of the right wafers (products) for the right customers at the right time (aka on-time delivery).
With that in mind, I think we need to define the metric in a way that it measures the impact on these 2 parameters. In a semiconductor FAB the typical unit to track wafer progress through the line is a “lot”. Hence, in order to measure how much a tool group impacts the flow of lots through the line, we need to look at a lot-related indicator. This disqualifies the grey-marked ones in the picture below and leaves us with the 3 potential candidates.
Let’s have a look at the greyed out metrics.
highest planned tool group utilization
It is very tempting to pick this metric, since very high tool utilization signals to some extent that we might reach capacity limits soon. Also, it is widely known that tool groups with high utilization tend to generate high cycle times. So there is a good chance that the true FAB bottleneck has a high or the highest utilization – but there is no guarantee that this is the case. This also very much depends on the overall utilization profile of the factory.
Another interesting topic to discuss in a future post is: What do “high” utilization and “high” cycle time mean ? Similarly, how to define “FAB capacity” – which I will also discuss in a later post.
highest actual tool group utilization
Everything I wrote above for the planned high utilization is valid for the actual utilization as well. I just like to add at this point: comparing actual tool group utilization and planned tool group utilization should be a frequent routine, to understand how closely the capacity model is able to follow the actual FAB performance – or should I say, how closely the actual FAB is able to follow the capacity model ? You guessed it, an interesting topic for another post …
Before we move on to the next metrics, I like to spend a few thoughts on the topic of the factory utilization profile. The factory utilization profile is a chart of all tool groups showing their average utilization (planned or actual, for a selected time frame like the last 4 or 8 weeks), sorted so that the tool group with the highest utilization is on the left and the one with the lowest utilization is on the right. A theoretical example is shown below:
Different factories will have different utilization profiles. Even the very same factory will have different utilization profiles over time if things like wafer starts, product mix, uptime or cycle time change. So I always thought it is a very good idea to keep an eye on that and also compare the profile’s planned data vs. actual data. An example comparison (with dummy data) is below.
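Building and comparing such a profile is straightforward to sketch (all utilization numbers below are dummy data, not from any real FAB):

```python
# dummy planned vs. actual average utilization per tool group (percent)
planned = {"TG1": 85, "TG2": 78, "TG3": 92, "TG4": 60, "TG5": 71}
actual  = {"TG1": 88, "TG2": 74, "TG3": 81, "TG4": 65, "TG5": 90}

# utilization profile: highest utilization on the left, lowest on the right
profile = sorted(planned, key=planned.get, reverse=True)
for tg in profile:
    print(f"{tg}: planned {planned[tg]}%  actual {actual[tg]}%  "
          f"delta {actual[tg] - planned[tg]:+d}%")
```

Large deltas between planned and actual are exactly the spots where the capacity model and the real FAB disagree.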
For example: Look at tool group number 3 ! How likely is it that #3 becomes a problem in FAB A vs. in FAB B ?
I think you get the general idea, but there is much more interesting stuff to read out of FAB utilization profiles. Before we go there – have you lately checked / seen your FAB’s utilization profile ?
most often discussed tool group
This metric has some advantages, since it is not focused on one specific indicator – and if a tool group is very often in focus, it surely has some problematic impact on the overall line performance. I would rather choose a real, data-based metric, but for FABs with less developed automatic data generation and data analytics capabilities it is a usable starting point. I also like that this approach – once used for some time – will inherently drive the demand for a more data-based approach: to find out why a tool group is discussed so often and where to start with improvement activities – which in today’s manufacturing world is an absolute must in my opinion.
highest OEE value
OEE, it feels, had its peak time when a lot of people talked about it, but lately the topic seems to have become a bit quieter. The OEE method itself has its value if used on the right tool groups with the right intentions. If applied solely to increase the nameplate OEE value of every tool group in the FAB, it can quickly become counterproductive and hinder the overall FAB performance (at least if FAB performance is defined and measured as proposed in this post). In my active days as a FAB Industrial Engineer I often used the slogan:
“… if the OEE method is used the right way, its target should be not to increase the OEE value of the tool group, but increase the tool groups idle time …”
If OEE projects aim in that direction, they will for sure help to improve the overall FAB performance – but as the key metric to identify the biggest bottleneck I would not recommend using OEE.
lowest uptime or availability
As mentioned above, uptime is a tool or tool group focused metric and for sure a very important one in every FAB. While low uptime is absolutely not desirable, it is not a good indicator of whether the tool group is indeed a factory bottleneck, since without other information it will not tell us anything about the actual impact on the FAB.
At this point I will stop for today. In my next post I will spend a bit more time on the 3 remaining – lot related – indicators and will also share which one I think is the most useful. As always, I would love to hear feedback from you via a comment. One last thing: I will eventually stop announcing every new post via LinkedIn, so if you want to get notified when there is new content here, please use the email subscription form below.
Happy Holidays !
Almost 15 years ago I had the opportunity to attend a 4 day seminar with the authors of the well known book “Factory Physics” LINK
In the opening session we talked about what limits factory performance, and sure enough bottlenecks came up. The question was asked: what can be done to improve a bottleneck ? After a lively discussion among all attendees about what they have done or what they think should be done, Dr. Mark Spearman stated:
“… I propose you walk on the factory floor and look at the tool or tool group and see if it is indeed running (at full speed and efficiency) …”
I had a pretty big “aha !” moment and I remember this, like it was yesterday. But this proposal comes with another interesting challenge:
How do we know what is the factory bottleneck ???
I think answering this question correctly is the foundation for a lot of things. In its simplest form, the correct answer would lead the folks who actually want to see the bottleneck on the floor to walk to the right tool/tool group. Obviously, there is much more connected to that, for example:
- where to spend resources for improvement activities
- if the bottleneck capacity is used to define the overall FAB capacity, it would be great, if the correct tool/tool group was identified
- where to spend capital to buy another tool
How do we find out what the factory bottleneck tool group is ? One obvious answer is: let’s look into data. But what data – and how do we know it is indeed the bottleneck ? The answer quickly becomes ” … it depends …”
It depends on the definition metric, and I have seen a few of them so far:
- highest tool utilization as per capacity planning numbers
- highest tool utilization as per actual numbers (daily, last week , 4 weeks ?)
- highest amount of WIP behind tool group
- highest average lot wait time at the tool group
- highest miss of daily moves vs. target
- frequency / intensity a tool group is discussed in morning meeting as a “problem kid”
- lowest tool group uptime ( or availability)
- highest OEE value
I’m pretty sure all of these metrics have some value if used in the right context. I do have my own opinion on what I would select as the key metric to declare the FAB bottleneck, but I really like to get some discussion going here. Therefore I will run a little poll to see what the majority would select as the key metric:
I can’t wait to see the results. I’m fully aware that the answer selection is not that straightforward without more context – so if you like to provide thoughts, please use the comment functionality at the bottom.
I will share and discuss the results in my next post, sometime before the holidays
I finally decided to start my own blog. It will be all about – surprise –
Factory Physics and Factory Automation
Why am I doing this ?
Over the years I had the chance to work very closely with different companies and their semiconductor factories, and I found that especially in the non-leading-edge companies/FABs a lot of folks are very interested in these topics – but often even basic principles are not known or understood. This was often true at all levels throughout the organization, from operators up to the senior leadership.
Throughout my professional career I enjoyed learning about these principles and using them for active decision making. I also realized that I liked sharing thoughts about those principles.
To keep this going in the future, I will start posting topics, questions and more at a loose frequency. I hope you will get something out of it for your daily business and also contribute to a fruitful discussion and exchange.
Stay tuned for more and if you have suggestions for topics, please let me know, I will for sure give them a try.