ISSN : 1598-7248 (Print)
ISSN : 2234-6473 (Online)
Industrial Engineering & Management Systems Vol.21 No.1 pp.58-73

Early-stage Project Outcome Prediction Considering Human Factors

Satoshi Urata*, Takaaki Kawanaka, Shuichi Rokugawa
Fujitsu, Ltd., Tokyo, Japan
Institute for Innovation in International Engineering Education, School of Engineering, The University of Tokyo, Japan
Research Center for National Disaster Resilience in National Research Institute for Earth Science and Disaster Resilience, Ibaraki, Japan
*Corresponding Author, E-mail:
Received January 13, 2021; Revised June 22, 2021; Accepted December 20, 2021


In the early stages of a project, project managers need a way to connect concrete actions to the factors that affect project success. This study aims to upgrade project management methodology by using machine learning technologies to predict project results. Using a new deep learning model called “deep tensor,” we predict project results at the time of completion, including quality, cost, and delivery time, by evaluating the project’s state in its earliest stage using various types of project knowledge assets. The results suggest that the deep tensor model is more accurate than the random forest and multiple regression models. We also present a way to use the factors that most influenced the model’s predictions to recommend specific advice. This research provides a method for sharing difficult-to-share knowledge across projects and will be useful for early, tangible improvement measures in the project execution phase.



    Various statistical analyses and machine learning techniques have been applied to the field of project management. The primary areas of project management that have been studied are estimating the effort, cost, and duration of initiatives required to develop schedules and budgets (Pospieszny et al., 2018) and predicting the number of failures (Hall et al., 2012). Besides these areas, new data based on deeper insights into the nature of a given project, project management behaviors, project procedures, human factors, and the external environment have been reported as crucial project success factors. Attempts have been made to more accurately understand the nature of a project and achieve success by considering these perspectives (Alias et al., 2014). Despite a large body of prior research on project teams, there have been few attempts at formally integrating findings and developing conclusions on how a project team’s success should be measured and what factors most strongly influence that success (Sivasubramaniam et al., 2012). This lack of consensus adds to the difficulties that managers face when attempting to improve project team performance (Liu and Cross, 2016).

    This study aims to improve project management using machine learning models that predict outcomes on the basis of data related to project knowledge. Some important factors that are strongly related to a project team’s performance, including human factors and the project manager’s knowledge (hereinafter called “project knowledge assets” or simply “knowledge assets”), are used. These knowledge assets include organizational knowledge, defined as the validated beliefs and understanding that a firm has about its relationship with its environment. These are company-specific resources that are indispensable in creating value for a firm (Nonaka and Takeuchi, 1995; Kamasak and Yucelen, 2009; Serrat and Serrat, 2017). In this paper, corporate activities are regarded as multiple projects, and successful projects are regarded as value creation. This paper aims to use knowledge assets to understand the main reasons underlying the success of a software project and to support their use. In other words, the focus is on eliciting knowledge for actions that contribute to the success of the project based on predictions regarding the project. It is not a method for deriving general factors but for identifying individual factors important for the success of a system construction project.

    In this paper, after constructing a prediction model using project knowledge assets that evaluate the project up to the upstream process, that is, the requirements definition process, we aim to elicit knowledge for actions that contribute to the success of the project on the basis of the prediction for each project. A rating scale is used to facilitate the objective assessment of person-to-person interactions and behaviors that occur within the requirements definition governing project quality (Watanabe and Obata, 2012). A diverse sample of 134 project teams from different industries is used to develop and test the machine learning models.

    The remainder of the paper is organized as follows. Section 2 is the literature review; Section 3 provides an overview of artificial intelligence (AI) model development; Section 4 describes the study methodology; Section 5 presents the results, which are discussed in Section 6. Section 7 summarizes and proposes areas for future research.


    2.1 Relationship between the Final Success or Failure of a Project and Risk

    Several studies have been reported on the relationship between the final success or failure of a project and risk. As a study on the analysis of risk factors, Conrow and Shishido (1997) summarized the major risk factors in risk management. Additionally, research is being conducted to analyze the final state of the project by using analysis results of risk factors. Avritzer and Weyuker (1999) evaluated 50 projects with a high risk of failure at a relatively early process stage. Additionally, Mockus and Weiss (2000) modeled the probability of failure due to software changes to predict the risk of new changes, and Yamaguchi et al. (2013) and Kamata et al. (2014) conducted studies on how to quantitatively monitor project activity.

    Ropponen and Lyytinen (2000) used principal component analysis to classify risk factors into large categories and investigate the relationship between environmental factors and risk in software development. Wohlin et al. (2003) demonstrated the possibility of project success prediction by applying principal component analysis on the basis of evaluation data regarding the project based on expert judgment. Jiang and Klein (2000) investigated the link between project efficiency and risk factors and concluded that a lack of team expertise and a lack of clarification regarding role definitions increased project risk. As a risk item, Kasser and Spring (1997) proposed the categorized requirements in process (CRIP) approach, which clarifies progress regarding requirement definition. Takagi et al. (2005) designed a questionnaire from five project perspectives: requirements, estimates, plans, team formation, and project management activities. Using logistic regression analysis with the resulting data, a model was constructed to characterize the confusion project (Takagi et al., 2005). However, although their research characterized risks, they did not make predictions regarding the project.

    One of the causes of project failure is failure in requirement definition. In a questionnaire survey of companies based on 390 responses from 245 companies, the leading causes of construction delays were “bad planning/requirement definition” and “frequent specification changes,” which together account for more than 40% of cases. Similarly, more than 40% of cost overruns were attributed to “frequent specification changes” and “additional work (planning/design/development),” and more than 40% of quality defects to “planning/requirement definition defects” and “test/migration/introduction defects” (JUAS, 2019). Requirement definition is an important work process in information technology system construction, and the most important cause of a failed project is a requirement gap. Wiegers and Beatty (2014) summarized requirements-related risk factors into five subfields of requirements engineering: requirements elicitation, requirements analysis, requirements specification, requirements validation, and requirements management. Regarding risk in the early stages of a project, Blanchard and Blyler (2016) reported that early-stage decision making in the system life cycle determines 70%~80% of life cycle cost. Additionally, from an analysis of actual data from system development projects, Boehm (1981) verified that the cost of correction or change (rework) that goes back to the initial process of the project increases as the project progresses. “Reworking” in system development accounts for 40%~50% of the total system development cost (Shull, 2002), and errors in requirements definition account for 70%~85% of the reworking cost (Leffingwell, 1997). Thus, the quality of the requirements definition process affects the final success or failure of the system development project, so risk evaluation in the requirements definition process is important for making predictions at an early stage of the project.
For various risks, it is possible to contribute to the success of the project by identifying a project with a high possibility of failure at an early stage and taking corrective measures before the failure becomes apparent.

    2.2 Prediction of Project Success/Failure

    Project forecasts to estimate cost or development effort have been studied by various researchers (Boehm et al., 2000). Mizuno et al. (2005) collected data from 40 projects from software development companies and showed that project confusion can be predicted with high accuracy by applying methods such as Bayesian classifiers. Abe et al. (2006) constructed a model that can predict the achievement of quality with an accuracy of approximately 70.4%~87.5% using the data of a specific contractor. One problem with these studies is that the timing of forecast implementation was not elucidated.

    To clarify the timing and make predictions, Mori et al. (2013) suggested that predictions can be made with reasonable accuracy after the programming process. Oba et al. (2017) attempted to predict the success of Japanese information communication technologies (ICT) projects using actual values such as project period and development man-hours and confirmed that prediction can be performed at a progress rate of 50%. Kusano et al. (2017) predicted project success using variables acquired through earned value management (EVM) and confirmed the prediction accuracy as the progress rate of the project changed. The progress rate was varied in 10% increments from 10% to 90%, and the project was predicted nine times in total, once at each timing. It was shown that approximately 92% of projects requiring attention can be identified by the middle of the project (Kusano et al., 2017). These studies clarified prediction timing, but the prediction could only be made after the project had progressed approximately 50%. Earlier prediction is desirable, for example, at the time the requirements are confirmed.

    To make predictions during the requirements definition process, Kawamura and Takano (2020) collected features consisting of 17 items and obtained a project success prediction model with high prediction accuracy (84.0%) using the naïve Bayes classifier. Nevertheless, although the evaluation items used were effective for predicting the success of the project or the development man-hours, extracting the information necessary for improving the project from the prediction results was difficult because the evaluation items were abstract.

    Regarding the evaluation items of projects, several studies have predicted project performance, such as development man-hours and project success/failure, using project data collected from system development projects (Takagi et al., 2005; Tsunoda et al., 2005; Liu et al., 2006; Basha and Ponnurangam, 2010; Debari et al., 2012). Prediction at the design phase has been proposed on the basis of data generally collected in firms, such as the type of development project (new or improved), main development language, and architecture (Debari et al., 2012). These various project performance prediction studies are published in the Software Development Data White Paper (IPA, 2019) (hereinafter referred to as the “Project Data White Paper”). These studies use only the existing standard evaluation items, such as the data published in the white paper. Although organizational problems are known to be a leading cause of system development failure and poor performance (Kearney, 1990; Buchanan, 1991; Hornby et al., 1992; Clegg et al., 1997; Ahn and Skudlark, 1997), studies using only standard evaluation items do not specifically address human factors related to organization and management, which greatly influence the success or failure of such projects (Ahn and Skudlark, 1997; Doherty and King, 1998; Doherty and King, 2001; Conrow and Shishido, 1997).

    Regarding project evaluation items related to human systems, Takagi et al. (2005) and Mizuno et al. (2005) predicted project success using evaluation items that they developed themselves, primarily on the basis of studies regarding software development risk factors (Kasser et al., 1997; Williams et al., 1999; Fairley, 1994; Wiegers and Beatty, 2014). Mizuno et al. (2005) also mention requirements engineering skills (e.g., a lack of explanation on the request side) and project management skills (e.g., a lack of review of project plans). Kawamura and Takano (2020) created 17 evaluation items and used them to predict project success. The evaluation items were organized into three main categories, namely, project content, development process, and people and activities, with reference to the framework of organizational background, project content, development process, and people and activities proposed by McLeod and MacDonell (2011). In COCOMO II (Benediktsson et al., 2003), a typical software development effort estimation model, five scale factors (e.g., team cohesion) and 17 types of cost drivers (e.g., analyst ability) are used, each rated on up to six levels. These evaluation items include the evaluation of human factors related to organization and management, and the developed models are considered to be effective only for predicting project success or development man-hours. However, the evaluation items are abstract and do not help in deriving actions to improve the project from the prediction results.

    In this paper, a project prediction model is developed using an evaluation scale (Watanabe and Obata, 2012) to facilitate the objective evaluation of person-to-person interactions and behaviors in the requirements definition process.


    3.1 Challenges of the Deep Learning Model

    Few studies have applied deep learning techniques to project management. Deep learning is part of a broader family of machine learning methods that are based on artificial neural networks. Conventional machine learning techniques require careful engineering and considerable domain expertise to construct machine learning systems and design a feature extractor that transforms raw data into a suitable internal representation or feature vector. Deep learning creates computational models that are composed of multiple processing layers using representations of data with multiple levels of abstraction. A key aspect of deep learning is that these layers of features are not designed by human engineers: they are learned from the data using a general-purpose learning procedure (LeCun et al., 2015).

    Deep learning provides a common scale for how much risk currently exists: when the model predicts whether a project will succeed, a predicted failure indicates high risk, and a predicted success indicates low risk, even though the judgment logic is a black box and the causes are not clear. However, it remains unclear which knowledge assets have a greater impact on the project, or by what mechanism the lack of a specific skill affects the project outcome. The deep learning model makes the basis of its predictions a black box, and clarifying the features used inside that black box becomes a bottleneck. For this reason, it is extremely difficult to guide the project manager on how to use the predictions to deal with specific risks.

    Generally, AI-based systems often lack transparency; although the black box nature of these systems allows for powerful predictions, the systems cannot be directly explained (Adadi and Berrada, 2018). Figure 1 is an image created by the authors from a diagram in the explainable AI project of the Defense Advanced Research Projects Agency (DARPA) (Gunning, 2017). The vertical axis of the right-hand graph presents the accuracy, and the horizontal axis presents the explainability. Although machine learning that uses deep learning methods can obtain a high degree of accuracy, the relationship between input and output can become very complicated and virtually impossible to explain to the user.

    To address this problem, it is possible to identify the factors that contribute to an estimation using a deep learning method called deep tensor (DT).

    3.2 DT

    DT is a machine learning technology developed by Fujitsu that enhances deep learning for graphical data (data representing the connections corresponding to classifications in supervised learning). DT combines high-accuracy estimation with the ability to identify the estimation factors. There are two major features of DT. First, it can learn and make inferences from graphical data. Second, it can identify the factors that influenced the prediction results and the degree of those factors’ influence.

    Even for experts, discovering important features when there is a large amount of data is difficult. The DT approach makes it possible to extract features mechanically and systematically, making it possible to easily find the rules and features of the relationships.

    Figure 2 presents a conceptual diagram of the learning process using the DT model.

    With conventional machine learning methods, expert judgment must be added to the design of the feature set used to feed the model with graphical data, and accuracy is limited. DT represents graphical data in a mathematical form called a “tensor.” By converting the graphical data into a unified tensor expression using tensor decomposition, the feature quantities of the graphical data are extracted. Learning is performed by inputting the resulting tensor expression into a neural network. Since the tensor representation is trained such that the predictive accuracy is high, it has a structure in which characteristic factors that contribute strongly to prediction are extracted from the input graph. In other words, the tensor decomposition itself is optimized using the extended error backpropagation method, an extension of conventional neural network learning technology, so the design of feature quantities is automated. This enables highly accurate predictions (Maruhashi, 2017; Maruhashi et al., 2018).
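As a rough, simplified illustration of this idea (not Fujitsu’s actual DT implementation, which optimizes the decomposition jointly with backpropagation), a graph can be encoded as an adjacency matrix and compressed into a low-rank representation that serves as a fixed-length feature vector for a neural network. The sketch below substitutes a plain SVD for the learned tensor decomposition, with a made-up four-node graph:

```python
import numpy as np

# Hypothetical example: a tiny project graph with 4 nodes
# (e.g., manager, customer, requirement items), encoded as an
# adjacency matrix; DT generalizes this to higher-order tensors.
adjacency = np.array([
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

# A rank-2 SVD stands in for the tensor decomposition: the scaled
# singular vectors summarize the graph structure as a fixed-length
# feature vector that could be fed to a neural network.
U, s, Vt = np.linalg.svd(adjacency)
rank = 2
features = (U[:, :rank] * s[:rank]).flatten()  # shape (8,)

# Reconstruction error shows how much structure the low-rank
# representation preserves (this matrix has rank 2, so it is ~0).
approx = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]
print(features.shape)
print(round(float(np.linalg.norm(adjacency - approx)), 6))
```

In DT itself, the decomposition parameters are trained end to end so that the extracted features maximize predictive accuracy, rather than being fixed in advance as here.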

    DT includes a technique for identifying the factors that contribute significantly to the prediction result from the tensor representation by performing a linear inverse transformation to identify the prediction factors of the input graph that largely contributed to the prediction result. Thus, by using DT, besides achieving high-precision predictions, it is possible to identify key prediction factors (Fuji et al., 2019).


    4.1 Research Flow Summary

    Our goal is to create a machine learning system that generates advice for project managers, leading to actions that increase project success. The prediction model at the core of this system is named the Project Result Prediction (PRP) model. Three versions of this model have been developed: a multiple regression (MR) model, a random forest (RF) model, and a deep learning-based DT model. By evaluating project knowledge assets at the time the upstream processes of the project are carried out, the subsequent trend of the project can be evaluated. If the prediction result indicates that the project is in a risky state, the factors that affect it are fed back to reduce project risk.

    Steps of the research flow are described as follows:

    • Step 1: Data about the project manager’s knowledge and project results are collected (covering 134 projects).

    • Step 2: Data shaping for the input to the machine learning model is described, along with the classification of the models’ training data and test data.

    • Step 3: The points for modeling are described for the three types of models.

    • Step 4: The accuracies of the models are compared, and the feature extraction of the project is described.

    • Step 5: The concrete support method of the project manager is described.

    4.2 Data Acquisition

    There are three types of data items used in this study: project evaluation items (Type A), project performance items (Type B), and standard evaluation items (Type C). Table 1 presents the evaluation items of knowledge assets, project performance items, and existing standard evaluation items.

    Type A items comprise 42 items of potential human factors that have been identified through discussions with experts. These were collected using a five-point Likert-type response scale. These were used as explanatory variables for model development. We established a rating scale to facilitate an objective assessment of the human-to-human interactions and behaviors that occur within the requirements definition that governs the service quality of a project (Watanabe and Obata, 2012). The model was developed and evaluated with reference to the evaluation items developed by Watanabe and Obata. It provides a detailed view of the efforts and status of the customer organization, the vendor’s skill in adjusting requirements, and the project management ability of the vendor. Evaluation items that reflect knowledge assets representing concrete project practices are developed as follows. First, we extract important practical tools from the activity history of actual systems development projects. Next, we derive a story that leads to the success or failure of the project through ethnographic interviews of experts using a questionnaire conducted on the basis of the Q classification method. Lastly, we prioritize the extracted practical items through discussions with experts (Watanabe and Obata, 2012). Table 2 shows the evaluation items of Type A project knowledge assets.

    Type B items reflect the actual results at the time of project completion and were used as objective variables. Type B items include quality (assessed by the difference between the planned and actual numbers of changes in specifications), cost (assessed as the rate of excess compared with the specified cost), and delivery (assessed as the delay from the specified date). For items that refer to the IPA/SEC White Paper, the answer format is based on the IPA/SEC’s data collection method.

    Type C contains standard project attributes and comprises 12 items. Type C items were used when best practices for projects were extracted from similar project groups using the clustering method described in the discussion section (STEP 5). These items consist of existing standard evaluation items such as development type and development scale, as published in the white paper from the Software Engineering Center Information-Technology Promotion Agency (IPA, 2018). Type C items include items that can be used for project prediction, but in this paper, they are not included in the explanatory variables in view of the provision of advice information after prediction. Table 3 shows the standard evaluation items of Type C.

    4.3 Research Flow

    • Step 1. Data Acquisition

    Data about projects completed by a system development vendor were collected. The vendor has a community of project managers who were surveyed using a questionnaire that included the data items to be analyzed in this study. The project managers responded to the questionnaire regarding the projects in which they had been involved in the past. Two surveys were conducted: the first in 2012 and the second in 2018. From the first survey’s responses, we obtained satisfactory data for 58 targeted projects; these were used as test data. From the second survey, we collected data for 76 targeted projects and used all of them as training data. Figure 3 indicates how parts of the acquired data were used to create the model. The answers to each question regarding the state of the upstream process in each project were used as the explanatory variables. Results from questions about quality, cost, and delivery time at the time of project completion were used as objective variables.

    A limitation of this method is that the responses of a single team member were used to measure team-level variables in this study. Most of the respondents were project managers, and we considered them to be reliable for evaluating those projects in which they were involved. However, collecting data from entire teams would allow researchers to measure team-level variables and evaluate the extent of agreement across team members’ responses more accurately.

    • Step 2. Data Preparation

    The respondents’ answers to the questionnaire were converted to the appropriate formats for model development. For the MR and RF versions of the PRP model, model development used the format described in Figure 3. For the DT version of the PRP model, the format must be converted. Figure 4 presents the input data structure for the DT version of the PRP model development. The definitions of the explanatory variables and objective variables are the same as in the data structure illustrated in Figure 3.
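As a minimal sketch of the two input formats described above (item names and values here are hypothetical, not the study’s actual data), the same questionnaire answers can be shaped into a tabular matrix for the MR/RF versions and into graph edges for the DT version:

```python
# Each respondent's five-point Likert answers become a numeric
# feature row for the MR/RF models, while for DT each answer is
# re-expressed as a (project, item, score) edge of a graph.
surveys = [
    {"project": "P001", "A01": 4, "A02": 2, "quality": 3},
    {"project": "P002", "A01": 5, "A02": 4, "quality": 5},
]

item_cols = ["A01", "A02"]  # Type A explanatory items (hypothetical subset)

# Tabular format for the MR and RF versions of the PRP model.
X = [[row[c] for c in item_cols] for row in surveys]
y = [row["quality"] for row in surveys]  # Type B objective variable

# Graph format for the DT version: one edge per (project, item) pair.
edges = [(row["project"], c, row[c]) for row in surveys for c in item_cols]

print(X)         # [[4, 2], [5, 4]]
print(edges[0])  # ('P001', 'A01', 4)
```

The explanatory and objective variables keep the same definitions in both formats; only the shape of the input differs, as in Figures 3 and 4.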

    • Step 3. Modeling

    Identifying risk factors and their causal relationships is an important task in risk management. Risk factors are interrelated; one risk factor might be the cause of another. Considering such a causal chain, the standard risk model proposed by Smith et al. has been adopted in this study (Preston and Guy, 2002). Figure 5 shows a conceptual diagram of the standard risk model.

    As shown in Figure 5, the standard risk model is regarded as a chain of three elements: a driver, an event, and an impact. A risk impact is the potential loss caused by risk. A risk event is a specific phenomenon or condition that causes loss. A risk event driver that exists in a project or its environment might cause a risk event, and it corresponds to the project evaluation items (A) in Figure 6. The PRP model focuses on the relationships between risk-inducing factors (risk event drivers). The risk event driver, which is the root factor of risk, is defined as an explanatory variable. The model’s user does not consider the specific probability of the occurrence of each risk corresponding to a risk event driver; rather, that probability is implicitly included in the model to predict the impact on QCD.
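The driver, event, and impact chain of the standard risk model can be sketched as a simple data structure (the class names and example contents below are illustrative only, not taken from the study’s data):

```python
from dataclasses import dataclass, field


@dataclass
class RiskDriver:
    # A condition in the project or its environment, e.g., a Type A
    # evaluation item; used as an explanatory variable in the PRP model.
    description: str


@dataclass
class RiskEvent:
    # A specific phenomenon or condition that causes loss.
    description: str
    drivers: list = field(default_factory=list)


@dataclass
class RiskImpact:
    # The potential loss, expressed on a QCD dimension.
    qcd_dimension: str  # "quality", "cost", or "delivery"
    events: list = field(default_factory=list)


driver = RiskDriver("Customer requirements were not clearly adjusted")
event = RiskEvent("Frequent specification changes", drivers=[driver])
impact = RiskImpact("cost", events=[event])

# The PRP model skips explicit event probabilities: drivers serve as
# explanatory variables, and the impact on QCD is predicted directly.
print(impact.qcd_dimension, len(impact.events[0].drivers))
```

The PRP model effectively learns the driver-to-impact mapping end to end, leaving the intermediate event probabilities implicit.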

    We compared three different model versions: the MR model, the RF model, and the DT model. The models learn how project knowledge assets relate to requirement definitions and how interacting with customers affects quality, cost, and delivery time. The goal is to improve project outcomes by predicting the result when a project is in an early stage. Figure 6 presents a schematic image of the PRP model.

    The MR and RF models were built using R, a free, open-source programming language commonly used for statistical analysis. The lm() function in R was used to create the MR model. This function models the relationship between the predictor and response variables.

    RF is an ensemble learning method that combines multiple models to create a more powerful model. RF is a combination of tree predictors such that each tree depends on the values of a random vector sampled independently, using the same distribution for all trees in the forest. It has been found to perform very well compared with other classifiers and is robust against overfitting (Breiman, 2001). The RF implementation included in R was used to create the RF model. The R package implements Breiman’s RF algorithm (based on Breiman and Cutler’s original FORTRAN code) for classification and regression. The DT model was developed with the learning function of DT. DT offers two learning modes, GPU-based and CPU-based; the GPU-based mode was selected here. At the time of testing, the estimation function estimated the classification of the test data using the learned model. The prediction result explanation function generates scores for the predicted results, that is, it identifies the key prediction factors described in Section 3.2.
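The paper builds the MR and RF versions in R (lm() and the randomForest package). Purely as an illustrative sketch, an equivalent setup in Python with scikit-learn on synthetic Likert-scale data might look as follows; the dimensions (42 Type A items, 76 training projects, 58 test projects) are taken from the paper, but the data itself is random:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the questionnaire data: 42 Type A items on a
# five-point scale as explanatory variables, project result type (1..5)
# as the objective variable.
rng = np.random.default_rng(0)
X_train = rng.integers(1, 6, size=(76, 42))   # 2018 survey (training)
X_test = rng.integers(1, 6, size=(58, 42))    # 2012 survey (test)
y_train = rng.integers(1, 6, size=76)

# MR version: analogous to R's lm() on the same predictors.
mr = LinearRegression().fit(X_train, y_train)

# RF version: analogous to R's randomForest (Breiman's algorithm).
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

pred_mr = mr.predict(X_test)  # continuous; scored within a tolerance band
pred_rf = rf.predict(X_test)  # class labels 1..5
print(pred_rf.shape)          # (58,)
```

The DT version has no open-source equivalent here; its learning, estimation, and prediction-explanation functions are part of Fujitsu’s DT technology described in Section 3.2.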

    • Step 4. Evaluation

    The accuracy of the models was compared for two prediction patterns for quality, cost, and delivery time. First, we compare the prediction accuracy for the five-step classification of the project result type presented in Table 1. Second, we compare the accuracy of the two-step classification of the project result type, that is, whether the project is forecast to go as planned or better, or worse than planned. If the project result type = 4 or 5, the project is classified as “better or as planned.” If the result type = 1, 2, or 3, the project is classified as “worse than planned.” The results are described in Section 5.
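The mapping from the five-step result type to the two-step classification can be written directly:

```python
def to_binary(result_type: int) -> str:
    # Collapse the five-step project result type into the two-step
    # classification: 4 or 5 -> "better or as planned";
    # 1, 2, or 3 -> "worse than planned".
    if result_type in (4, 5):
        return "better or as planned"
    if result_type in (1, 2, 3):
        return "worse than planned"
    raise ValueError(f"unknown result type: {result_type}")


print(to_binary(5))  # better or as planned
print(to_binary(2))  # worse than planned
```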

    This study aims to contribute to improving the success rate of projects by providing advice on the basis of the prediction model’s results. Advice must be considered both for projects predicted to succeed and for those predicted to fail. Hence, an accuracy measure that accounts for both failure and success was used as an index. Accuracy is expressed as Accuracy = (TP + TN) / (TP + FP + FN + TN), the ratio of correctly predicted projects to all predictions. A true positive (TP) is a sample predicted to be positive whose label is actually positive; a true negative (TN) is predicted negative and actually negative; a false positive (FP) is predicted positive but actually negative; and a false negative (FN) is predicted negative but actually positive.
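The accuracy index follows directly from the four confusion-matrix counts; the counts in the usage example below are hypothetical, not the study’s results:

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    # Accuracy = (TP + TN) / (TP + FP + FN + TN): the share of
    # projects whose predicted class matches the actual class.
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical confusion-matrix counts for 58 test projects:
print(accuracy(tp=20, fp=5, fn=8, tn=25))  # 45/58, roughly 0.776
```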

    • Step 5. Deployment

    To evaluate a new project, the project is rated on the items in Table 1. Next, prediction is performed for each type (quality, cost, and delivery time) using the developed models, and the key prediction factors are extracted. A report showing an example of the project prediction results, including the execution image, is provided in Section 6.2.


    5.1 Training Data

    Data from the 2018 survey were used as training data (n = 76). They comprise projects that involve various industries, operations, and development dates. Table 4 lists the industry classifications as an attribute of the project used for training data.

    5.2 Test Data

    Data from the 2012 survey were used as test data (n = 58). They also comprise projects that covered various industries, operations, and development dates. Table 5 lists the industry classifications of the projects used for the test data.

    5.3 Verification Result (Prediction)

    The predictive accuracy of the DT version of the model and that of the MR and RF versions (the conventional methods) are presented on the left-hand side of Table 6 as the results of multiclass classification. Regarding the accuracy of the MR model, if the predicted value is within ±10% of the true value, it is counted as a correct prediction. The accuracy of the DT model is the highest for all of the prediction items in the five-step classification. The accuracy of binary predictions from the DT and RF versions of the model regarding whether the project will go as planned is presented for each project result type on the right-hand side of Table 6.

    From the results in Table 6, the predictive accuracy of the DT and RF versions of the PRP model in binary classification was relatively high, with the DT version being the more accurate. The binary accuracy for quality is the same for the DT and RF models, but for cost and delivery time, DT is more accurate than RF.

    On the basis of the evaluation results of the QCD items, the projects were classified as either a success or a failure: projects that succeeded on all three evaluation items (Q, C, and D) were classified as a “success,” and those that failed on any one item were classified as a “failure.” The accuracy of the binary predictions from the DT and RF versions of the model, regarding whether the project will go as planned, is presented in Table 7.
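    The labeling rule above is a simple conjunction of the three per-item results; a minimal sketch (function and argument names are ours):

```python
# A project is a "success" only if quality, cost, AND delivery all succeeded;
# failing any single item makes the whole project a "failure".
def project_label(q_ok, c_ok, d_ok):
    """Derive the overall success/failure label from the QCD evaluations."""
    return "success" if (q_ok and c_ok and d_ok) else "failure"
```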

    From the results in Table 7, DT is more accurate than RF in the binary classification of project success.

    5.4 Verification Result (Factor Extraction)

    Because the DT method can calculate which factors influenced a prediction result and to what degree, the evaluation items that most influenced each prediction target can be extracted. A score indicating each item and its degree of influence is obtained. Table 8 presents the items that had a major impact on the model in predicting quality, cost, and delivery time.

    For the project data used as training data, the contribution of the feature items used by DT for prediction is shown. For each target project, the factors that most affected the estimation were obtained. In other words, for each of the 58 projects used as test data, the relationship between the evaluation categories (Q, C, D) and the evaluation items illustrated in Table 8 was obtained.

    Among the items presented in Table 8, those related to quality are presented in Table 9, together with the five items that contributed the most when predicting a given project. In Table 9, the items that are characteristic of the predicted project, compared with the multiple projects used for training, are shown by item number; these correspond to the item numbers in Table 2, and their contents are described in the item-contents column. The item with the highest rank is the one with the highest contribution. These results can be used to propose methods of improving project management.
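    Ranking the top five contributing items, as in Table 9, amounts to sorting per-item influence scores. The sketch below assumes such scores are available per project (as the DT method provides); the dictionary, item numbers, and score values are made-up examples.

```python
# Given per-item influence scores for one predicted project, return the
# k items with the largest contribution, highest rank first.
def top_contributors(scores, k=5):
    """Return the k (item_number, score) pairs with the largest scores."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

scores = {"item_12": 0.31, "item_04": 0.27, "item_19": 0.22,
          "item_07": 0.11, "item_23": 0.05, "item_02": 0.04}
ranked = top_contributors(scores)
```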


    6.1 Effectiveness of the PRP Model

    A direct comparison with other studies is difficult because conditions such as the projects to be predicted and the method for verifying predictive ability differ. Nevertheless, from the results of project success prediction at the time of request confirmation, the accuracy is approximately 84%. Considering the finding of Kawamura and Takano (2020) that project success can be predicted correctly, it can be said that the prediction mechanism shown in this paper has a certain predictive ability. When predicting the final success or failure of a project at the point when the requirements definition process is finalized, and when determining the degree of involvement of the project support organization, a predictive ability of approximately 80% is said to be useful as reference information for decision making. Against the benchmark of Kawamura and Takano (2020), the binary-classification results of the DT and RF versions of the PRP model in this paper meet that level. Moreover, when the DT and RF versions of the PRP model were compared, the predictive accuracy of the DT version was higher than that of the RF version.

    6.2 Project Management Support Framework Including the PRP Model

    The purpose of this paper is not only to predict project success at the early stages of a project but also to provide specific advice to the project manager. For each project to be predicted, the items that contributed to the prediction, as shown in Table 8, can be obtained. By drawing on the feature points of similar projects, it is possible to advise project managers who oversee new projects and help them make better decisions.

    The schematic image of the recommendation mechanism for project management is shown in Figure 7.

    First, (1) classify similar projects. Similar projects are extracted using the standard evaluation items in Table 3; cluster analysis can be used for this classification. Here, for simplicity, similar projects are extracted by focusing only on development scale (planned effort converted to man-months). For example, by distinguishing large-scale projects (200 man-months or more) from small and medium-sized projects (<200 man-months), it is possible to determine to which category a new project to be predicted is closer.
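    The simplified scale-based classification in step (1) can be sketched as follows; the function name and threshold constant are ours, though the 200 man-month boundary comes from the example above.

```python
# Step (1) simplified: classify a new project by development scale alone,
# using the 200 man-month boundary described in the text.
LARGE_SCALE_THRESHOLD = 200  # planned effort, converted to man-months

def scale_category(planned_man_months):
    """Return 'large' for >= 200 man-months, else 'small/medium'."""
    if planned_man_months >= LARGE_SCALE_THRESHOLD:
        return "large"
    return "small/medium"
```

A full implementation would instead cluster projects on all the standard evaluation items in Table 3.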

    Next, (2) extract the factors with the best results. The best-practice projects among large-scale projects—that is, projects with high performance values for quality, cost, and delivery time—are identified, and the items in Table 2 that most influenced project promotion and are expected to be effective are extracted. Feature items are extracted in the same way for best-practice projects among small and medium-sized projects.

    Then, (3) recommend improvement advice. For the large-scale and the small and medium-sized best-practice projects, advice wording is prepared on the basis of the items identified in (2) as having an impact on project promotion, and this information is recommended to the project manager.

    Regarding the wording of advice, comments for improving the project must be considered and prepared in advance. Table 10 shows an example of the advice wording in the quality category; the rank and item number are the same as the values shown in Table 9.

    Consequently, when a low score is predicted for a certain item, specific recommendations on the actions that could improve the project results can be made on the basis of a similar project. In other words, advice based on the experience and know-how of individual project managers provides a means of sharing the drivers of success across multiple projects.

    The deep learning model predicts the project state at the time of completion. This is presented on a common axis of how much risk exists, on the basis of a comprehensive assessment of human factors, which makes it possible to objectively understand the relationships among multiple projects in terms of human factors. For individual projects, detailed, prioritized recommendations can be provided regarding the items that the project lacks.

    Advice can be provided to the project manager by implementing the series of steps described above: (1) classification of similar projects, (2) extraction of the factors with the best results, and (3) recommendation of improvement advice. Improvement advice on the actions that can be taken to improve project results can then be prioritized and presented for each of quality, cost, and delivery time.
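    The three steps can be strung together as a single sketch. All data structures here are hypothetical stand-ins: `best_practice_items` maps a scale category to the influential item numbers of its best-practice projects, and `advice_wording` maps an item number to prepared advice text in the style of Table 10; the item numbers and wording are invented for illustration.

```python
# Steps (1)-(3) of the recommendation mechanism for one new project.
def recommend(planned_man_months, best_practice_items, advice_wording, k=3):
    """Classify scale, pick best-practice items, and emit prepared advice."""
    category = "large" if planned_man_months >= 200 else "small/medium"  # (1)
    items = best_practice_items[category][:k]                            # (2)
    return [advice_wording[i] for i in items if i in advice_wording]     # (3)

best_practice_items = {"large": ["item_12", "item_04", "item_19"],
                       "small/medium": ["item_07", "item_02", "item_23"]}
advice_wording = {"item_12": "Hold joint requirement reviews with the client.",
                  "item_04": "Assign a dedicated quality-assurance reviewer.",
                  "item_19": "Set intermediate delivery checkpoints."}
advice = recommend(320, best_practice_items, advice_wording)
```

A production version would repeat this per QCD category and rank the advice by the contribution scores discussed in Section 5.4.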

    In the model presented, the Q, C, and D items are handled in parallel, and the triple constraint is not considered. To better interpret the triple constraint and its dynamics in the future, it will be necessary to discuss which QCD items should be prioritized when making recommendations, considering the restrictions of the trilemma (Van Wyngaard et al., 2012). That is, the PRP model cannot itself determine which QCD item should be prioritized when a triple-constraint trade-off occurs; the priority of the QCD items is therefore left to the discretion of the person receiving the recommendation.


    In this paper, project prediction models were developed using machine learning, namely the DT and RF versions of the PRP model. With these models, the quality, cost, and delivery-time results at the time of project completion were predicted on the basis of a state evaluation at the initial stage of the project. From the prediction results of a binary classification (as planned or better vs. worse than planned), relatively high prediction accuracy was obtained for the DT and RF versions of the PRP model, and the results suggest that the DT model is more accurate than the RF and MR models. By predicting the outcome of a project with machine learning technology, a method is presented that can lead to actions for project success at an early stage. A framework for project management support that includes the machine learning model is also shown: a series of steps that supports a project manager’s decision making by preparing data, making predictions with the PRP model, classifying similar projects, extracting the factors of the best-practice projects that gave the best results, and making recommendations for improving project results.

    In future research, the models could be enhanced by accumulating more data similar to that used in this study, to improve predictive accuracy, and by expanding the types of data included. This approach is expected to extract various features that are difficult for project managers to grasp. Depending on the characteristics of a project, the required project knowledge assets and the priority of actions for its success are considered to change. The framework presented in this paper contributes, through machine learning, to the construction of appropriate knowledge assets in a project. Thus, objective information can be presented for overcoming the cognitive differences caused by differences in project characteristics.



    Figure 1. Trade-off between explainability and accuracy.

    Figure 2. Conceptual diagram of the learning process of the DT model.

    Figure 3. Schematic image of the data structure for model development.

    Figure 4. Schematic image of the data structure of the DT version of PRP modeling input.

    Figure 5. The standard risk model.

    Figure 6. Schematic image of the PRP model.

    Figure 7. Schematic image of the recommendation mechanism for project management.


    Table 1. Sample items and answer methods

    Table 2. Project knowledge assets (Type A)

    Table 3. Standard evaluation items (Type C)

    Table 4. Industry type (training data)

    Table 5. Industry type (test data)

    Table 6. Prediction accuracy using the prediction method

    Table 7. Project prediction results (based on QCD)

    Table 8. Items that contributed to predictions and the degree of contribution

    Table 9. Evaluation items selected in the quality category

    Table 10. Recommendation items selected in the quality category


    1. Abe, S., Mizuno, O., Kikuno, T., Kikuchi, N., and Hirayama, M. (2006), Estimation of project success using Bayesian classifier, Proceedings of the 28th International Conference on Software Engineering, 600-603.
    2. Adadi, A. and Berrada, M. (2018), Peeking inside the black-box: A survey on explainable artificial intelligence (XAI), IEEE Access, 6, 52138-52160.
    3. Ahn, J. H. and Skudlark, A. E. (1997), Resolving conflict of interests in the process of an information system implementation for advanced telecommunication services, Journal of Information Technology, 12(1), 3-13.
    4. Alias, Z., Zawawi, E. M. A., Yusof, K., and Aris, N. M. (2014), Determining critical success factors of project management practice: A conceptual framework, Procedia - Social and Behavioral Sciences, 153, 61-69.
    5. Avritzer, A. and Weyuker, E. J. (1999), Metrics to assess the likelihood of project success based on architecture reviews, Empirical Software Engineering, 4, 199-215.
    6. Basha, S. and Ponnurangam, D. (2010), Analysis of empirical software effort estimation models, Computer Science, 7(3), 68-77, Available from:
    7. Benediktsson, O., Dalcher, D., and Woodman, M. (2003), COCOMO-based effort estimation for iterative and incremental software development, Software Quality Journal, 11, 265-281.
    8. Blanchard, B. S. and Blyler, J. E. (2016), System Engineering Management (5th Ed.), John Wiley & Sons.
    9. Boehm, B. (1981), Software Engineering Economics, Prentice Hall, New York.
    10. Boehm, B. W., Abts, C., Brown, A. W., Chulani, S., Clark, B. K., Horowitz, E., Madachy, R., Reifer, D., and Steece, B. (2000), Software Cost Estimation with COCOMO II, Prentice Hall, Upper Saddle River, NJ.
    11. Breiman, L. (2001), Random forests, Machine Learning, 45, 5-32.
    12. Buchanan, D. (1991), Figure-ground reversal in systems development & implementation: From HCI to OSI, In: M. Nurminen and G. Weir (eds.), Human Jobs and Computer Interfaces, North Holland, 213-226.
    13. Clegg, C., Carery, N., Dean, G., Hornby, P., and Bolden, R. (1997), Users’ reactions to information technology: Some multivariate models and their implications, Journal of Information Technology, 12, 15-32.
    14. Conrow, E. H. and Shishido, P. S. (1997), Implementing risk management on software intensive projects, IEEE Software, 14(3), 83-89.
    15. Debari, J., Kikuno, T., Ikuchi, N., and Hirayama, M. (2012), On prediction of project success using incomplete project data, Information Processing Society of Japan, 53(2), 662-671.
    16. Doherty, N. F. and King, M. (1998), The importance of organisational issues in systems development, Information Technology & People, 11(2), 104-123.
    17. Doherty, N. F. and King, M. (2001), An investigation of the factors affecting the successful treatment of organisational issues in systems development projects, European Journal of Information Systems, 10(3), 147-160.
    18. Fairley, R. (1994), Risk management for software projects, IEEE Software, 11(3), 57-67.
    19. Fuji, M., Morita, H., Goto, K., Maruhashi, K., Anai, H., and Igata, N. (2019), Explainable AI through combination of deep tensor and knowledge graph, Fujitsu Scientific & Technical Journal, 55(2), 58-64.
    20. Gunning, D. (2017), Explainable Artificial Intelligence (XAI), DARPA Information Innovation Office.
    21. Hall, T., Beecham, S., Bowes, D., Gray, D., and Counsell, S. (2012), A systematic literature review on fault prediction performance in software engineering, IEEE Transactions on Software Engineering, 38(6), 1276-1304.
    22. Hornby, C., Clegg, C., Robson, J., McClaren, C., Richardson, S., and O’Brien, P. (1992), Human & organizational issues in information systems development, Behaviour & Information Technology, 11(3), 160-174.
    23. Information-technology Promotion Agency, Japan (IPA) (2018), White Paper 2018 on Software Development Projects in Japan, Available from:
    24. Japan Users Association of Information Systems (JUAS) (2019), Corporate IT Trend Survey Report 2019, Nikkei BP, 205-213.
    25. Jiang, J. and Klein, G. (2000), Software development risks to project effectiveness, Journal of Systems and Software, 52(1), 3-10.
    26. Kamasak, R. and Yucelen, M. (2009), Knowledge asset management: Knowledge assets and their influence on the development of organizational strategies, 7th International Knowledge, Economy and Management Congress, 1977-1984.
    27. Kamata, S., Ominami, M., Yamauti, M., Yamakawa, N., and Chikusa, M. (2014), Consideration for early detection of project status deterioration, Journal of the Society of Project Management, 16(3), 9-14.
    28. Kasser, J. and Spring, S. (1997), What do you mean, you can’t tell me how much of my project has been completed?, Systems Engineering, 7(1), 700-705.
    29. Kawamura, T. and Takano, K. (2020), Project outcome prediction at the requirement establishment for ICT vendors, Journal of Japan Industrial Management Association, 71(3), 137-148.
    30. Kearney, A. T. (1990), Barriers to the Successful Application of Information Technology, DTI & CIMA, London.
    31. Kusano, Y., Yokoyama, M., Liu, G., Tamura, T., Ishii, N., Okada, K., and Yokoyama, S. (2017), Method to dynamically predict project success or failure based on past data, Journal of the Society of Project Management, 19(3), 29-34.
    32. LeCun, Y., Bengio, Y., and Hinton, G. (2015), Deep learning, Nature, 521, 436-444.
    33. Leffingwell, D. (1997), Calculating your return on investment from more effective requirements management, American Programmer, 10(4), 13-16.
    34. Liu, W. H. and Cross, J. A. (2016), A comprehensive model of project team technical performance, International Journal of Project Management, 34(7), 1150-1166.
    35. Liu, X., Kane, G., and Bambroo, M. (2006), An intelligent early warning system for software quality improvement and project management, Journal of Systems and Software, 79(11), 1552-1564.
    36. Maruhashi, K. (2017), Deep tensor: Eliciting new insights from graph data that express relationships between people and things, Fujitsu Scientific and Technical Journal, 53, 26-31.
    37. Maruhashi, K., Todoriki, M., Ohwa, T., Goto, K., Hasegawa, Y., Inakoshi, H., and Anai, H. (2018), Learning multi-way relations via tensor decomposition with neural networks, Thirty-Second AAAI Conference on Artificial Intelligence, 3770-3777.
    38. McLeod, L. and MacDonell, S. G. (2011), Factors that affect software systems development project outcomes: A survey of research, ACM Computing Surveys, 43(4), 24-56.
    39. Mizuno, O., Abe, S., and Kikuno, T. (2005), Development of project confusion predicting system using Bayesian classifier: Towards its application to actual software development, SEC Journal, 1(4), 24-35.
    40. Mockus, A. and Weiss, D. M. (2000), Predicting risk of software changes, Bell Labs Technical Journal, 5(2), 169-180.
    41. Mori, T., Kakui, S., Tamura, S., and Fujimaki, N. (2013), Deployment of project failure risk prediction model, Journal of the Society of Project Management, 15(4), 3-8.
    42. Nonaka, I. and Takeuchi, H. (1995), The Knowledge-Creating Company: How Japanese Companies Create the Dynamics of Innovation, Oxford University Press, New York.
    43. Oba, M., Kono, A., Kamata, S., Ushiroda, S., Takagi, S., Yamade, K., and Hane, T. (2017), Predictive detection of unprofitable projects by utilizing artificial intelligence (AI), Journal of the Society of Project Management, 19(4), 15-20.
    44. Pospieszny, P., Czarnacka-Chrobot, B., and Kobylinski, A. (2018), An effective approach for software project effort and duration estimation with machine learning algorithms, Journal of Systems and Software, 137, 184-196.
    45. Preston, G. S. and Guy, M. M. (2002), Proactive Risk Management, CRC Press, Boca Raton.
    46. Ropponen, J. and Lyytinen, K. (2000), Components of software development risk: How to address them? A project manager survey, IEEE Transactions on Software Engineering, 26(2), 98-112.
    47. Serrat, O. (2017), Managing knowledge in project environments, Knowledge Solutions, 509-522.
    48. Shull, F., Basili, V., Boehm, B., Winsor Broun, A., Costa, P., Lindvall, M., Port, D., Tesoriero, R., and Zelkowitz, M. (2002), What we have learned about fighting defects, Proceedings of the 8th IEEE Symposium on Software Metrics, 249-258.
    49. Sivasubramaniam, N., Liebowitz, S. L., and Lackman, C. L. (2012), Determinants of new product development team performance: A meta-analytic review, Journal of Product Innovation Management, 29(5), 803-820.
    50. Takagi, Y., Mizuno, O., and Kikuno, T. (2005), An empirical approach to characterizing risky software projects based on logistic regression analysis, Empirical Software Engineering, 10(4), 495-515.
    51. Tsunoda, M., Ohsugi, N., Monden, A., Matsumoto, K., and Satou, S. (2005), Software development effort prediction based on collaborative filtering, Journal of Information Processing Society of Japan, 46(5), 1155-1164.
    52. Van Wyngaard, C., Pretorius, J., and Pretorius, L. (2012), Theory of the triple constraint: A conceptual review, 2012 IEEE International Conference on Industrial Engineering and Engineering Management, 1991-1997.
    53. Watanabe, S. and Obata, A. (2012), Human factor explanatory variables that predict project results, 2012 IPSJ/SIG Softw. Eng., 1-6.
    54. Wiegers, K. and Beatty, J. (2014), Software Requirements (3rd Ed.), Microsoft Press, Redmond.
    55. Williams, R. C., Pandelios, G. J., and Behrens, S. G. (1999), Software Risk Evaluation (SRE) Method Description: Version 2.0, SEI Joint Program Office, 153.
    56. Wohlin, C. and Andrews, A. A. (2003), Prioritizing and assessing software project success factors and project characteristics using subjective data, Empirical Software Engineering, 8(3), 285-308.
    57. Yamaguchi, K., Chikada, M., Nishizawa, K., Nakagawa, K., Endou, T., Furugen, K., Tanimoto, S., and Saito, N. (2013), A quantification method of project activity situation by project management office’s monitoring, Journal of the Society of Project Management, 15(6), 23-28.