Kinship, gender and social links impact on micro group lending defaults

Joint liability aspires to improve micro-loan performance through the support of, and pressure from, the group borrowers. This paper examines how group composition, in terms of the mixture of kinship (family) ties and social (friends and neighbours) ties among the borrowers, affects default rates. Using binary logistic regression and three machine learning models, responses from 507 group micro-loan borrowers from four major Moroccan cities were analysed. The results show that the stronger the family and kinship ties within a loan group, the higher the default rate. Conversely, the stronger the social ties among the group, the lower the default rate. Other key findings include that the diversion of fund usage from investment to consumption is not found to

Obtaining start-up finance is challenging given that the sources are restrictive (Jackson & Young, 2016). Funds often come from personal savings and the support of family and friends (Kotha & George, 2012). This group of entrepreneurs (Mia & Lee, 2017) is often forced either to borrow via informal means, such as money lenders, at exceptionally high interest rates, or to accept exclusion from formal credit, thus losing out on investment opportunities (De Aghion, 1999).
Micro-finance (MF) has been introduced as a mechanism which provides small sums of credit and other financial services to the poor and vulnerable who do not have access to formal financial institutions (Mohammed & Wobe, 2019;Wijesiri et al., 2017). Micro loans are lent to the poor who lack the common lending criteria adopted by mainstream commercial banks, such as collateral, guarantors and/or prior credit history. MF represents an inclusive approach for the financially excluded and can be an effective mechanism to alleviate poverty and promote socio-economic development and growth (Mohammed & Wobe, 2019;WorldBank, 2004).
MF's operations bridge the social and economic spectrums (Bello-Bravo & Amoa-Mensa, 2019) and aim to serve the dual objective of financial sustainability and social outreach (Wijesiri et al., 2017, p. 63). While working towards their social and economic mission, MF provision is hindered by the risk of repayment default (Guha & Chowdhury, 2013; McIntosh & Wydick, 2005; Mia & Lee, 2017; Mohammed & Wobe, 2019; Murthy & Mariadas, 2017). For example, McIntosh and Wydick (2005) found a high credit risk facing the MF institutions (MFIs) in Bangladesh, illustrated by 32% of the Grameen Bank's MF loans in the Tangail area being overdue by 2 years or more. Mohammed and Wobe's (2019) study in Ethiopia revealed that 45% of the borrowers in the study area of Wondo Genet Woreda either did not or could not repay their loans following the credit schedules.
The sustainability of the MF sector relies on loan repayment which, on the one hand, replenishes the MFIs' credit capacity and, on the other hand, demonstrates borrowers' success in using the funds for productive economic activities. CGAP (2009) states that a loan is in default when a borrower cannot or will not pay back the loan and when the MFIs no longer expect the loan to be repaid (WorldBank, 2004).
Although there has been a movement towards individual lending over the past decade or so, group lending has been widely adopted by MFIs worldwide (Abbink et al., 2006;Chowdhury et al., 2014). The popularity of group lending can be explained by its reach to more people, thus widening the benefits to the wider community. Also, group lending can provide MFIs with a kind of social collateral as group members provide guarantee to one another (Grameen System;Singh et al., 2017). When any member defaults, the rest of the group share the outstanding repayments, or all lose access to future loans (De Aghion, 1999). When a member defaults, they suffer a loss of reputation, social shame, deprived access to some or all of the social activities and resources, and ultimately the exclusion from the social group. Joint liability allows the MFIs to shift some of the costs and risk associated with group screening, loan monitoring and repayment enforcement to the borrowers (Besley & Coate, 1995;De Aghion, 1999;Dufhues et al., 2012;Mohammed & Wobe, 2019). Given the stated benefits of group lending, why do MFIs face increasing rates of loan default? The question has driven this research study to explore whether and how the social and family ties among group members affect loan default.

| How group joint liability is linked to social and family and kinship ties?
The relationship between the members of a social network varies in tie strength. A network which is made up of members who have strong ties, such as family or kinship, is likely to develop a strong sense of solidarity and trust which can be used to leverage opportunistic behaviour and build resources and resilience which are particularly critical at times of uncertainty, adversity and/or hardship (Jackson & Young, 2016). Bourdieu (1973) suggested that families possess their own symbolic and/or material resources which can be used to generate benefits for their own members. Bonding capital manifests itself in family businesses. The bond among the owners and managers creates informal self-regulating, self-reinforcing control mechanisms which complement, and in some cases replace, the formal control systems which are emphasised by Agency Theory to protect the interests of the principals (Mustakallio et al., 2002). Family businesses are often governed by a dual control system. These characteristics have empowered family businesses with a higher-than-average ability to survive, even in difficult times. Based on the above literature, the following two hypotheses are established:
H1. A micro loan group made up of family members (kinship) has a lower probability of default than non-family (social ties) groups.
H2. The stronger the family ties between group members, the lower the probability of micro loan default.
More recent research has switched its focus from quantitative to qualitative factors and examined the effect of different types of social capital and/or the balance between social bonds and bridges on sustainable development (Fransen, 2015; Hunt et al., 2015; Jackson & Young, 2016; Serra, 2011). As Narayan and Pritchett (1999) argued, while strong social ties promote cohesion and inter-dependency among members, the lack of diversity in the group impedes innovation, skills, outlook, financial resources and opportunities. In a similar way, Jackson and Young (2016) observed that over-connected members tend to lack the variety of resources and opportunities to drive the transformation into high return production groups.
Extending social interaction and relationships to dissimilar groups or individuals helps promote social tolerance and bridge social divides (Fransen, 2015). Intra-group interactions extend individuals' reach to a wider pool of information, intellectual capital, financial resources, power and/or opportunities which result in stronger bridging social capital (Fransen, 2015; Hunt et al., 2015; Nahapiet & Ghoshal, 1998; Paxton, 2002; Putnam, 2004; Putnam et al., 1994; Woolcock & Narayan, 2000), contributing to better social and economic outcomes (Hunt et al., 2015). Based on the above literature review, the following two hypotheses are drawn up for testing:
H3. Micro loan groups with a higher diversity of social and family ties between members have a lower default rate than groups consisting of only family members.
H4. Micro loan groups with a higher diversity of social and family ties between members have a lower default rate than groups consisting of only social members.
Mustakallio et al. (2002) argued that the close-tie emotions and relationships embedded in the governance structure and mechanisms of family businesses can have both a supportive and a destructive effect on operations and strategic decision making, and hence on economic performance. The owners and managers of family businesses are brought together by family ties, rather than by choice. Hence, their skillset, vision, attitude and commitment may not be fully aligned with the business needs (Mustakallio et al., 2002).
The close ties between group members can have a negative effect on the rescheduling of loan repayment (Dufhues et al., 2012). Ahlin and Townsend's (2007) study in Thailand revealed that the close bond and regular interaction between group loan members can conceal important project information from the lender, impede social sanctions and promote collusion. All group members can work together to default on repayments and shield one another from the social pressure coming from the wider community (Chowdhury et al., 2014). While Mustakallio et al. (2002) refer to close ties as family links, the definition and measurement of tie strength are not specified in the work of Ahlin and Townsend (2007) or Dufhues et al. (2012). Since previous studies have found that close ties could exacerbate loan default, the following two hypotheses are established to further examine the effect of the nature and strength of ties on loan default:
H5. The stronger the family ties, the higher the probability of micro loan default.
H6. The stronger the social ties, the higher the probability of micro loan default.
In terms of the strength of social ties, we have hypothesised that neighbourhood ties are stronger than friendship ties as the former can carry out higher social sanction in the case of group loan default than the latter due to the proximity of locations (Besley & Coate, 1995).
Besides the nature of the ties (Family and Kinship versus Social links) which forms the key variables of this study, a few other determinants of loan default are adopted as control variables. Explanations of the choice are illustrated below.

| Control factors affecting loan default rates
We have chosen to include some of the determinants identified by previous studies as control variables to test the relationship between the loan default rate and the main predictive variables (Family and Kinship and Social links).
Education level has been identified as a factor which influences loan repayment performance (Mohammed & Wobe, 2019). In our survey, all micro loan borrowers have a low level of education (secondary school and below). This commonality has enabled us to exclude education level as a determinant of default rate in this study.
The effect of age on loan default is mixed. For example, Mokhtar et al. (2012) found that borrowers in the 46-55 age group had a higher probability of having repayment problems. Mohammed and Wobe (2019) also found more defaulters in the 38-47 years group than in the younger age groups. These findings contradicted the common view which sees older borrowers as more responsible for loan repayment, whereas young and inexperienced borrowers, due to their age and immaturity, would increase the default rate (Dorfleitner et al., 2017).
To examine the effect of age on loan default, this study will focus on respondents who are aged between 30 and 49. The reason behind the choice is that this age bracket covers the dominant segment, 46%, of the micro loan beneficiaries in Morocco (CentreMohammed, 2020), while the rest of the total percentage is almost equally distributed among three other age brackets.
Other determinants of loan default, such as group size, gender, loan purpose and financial hardship, are adopted by this study. Group size is unique to group lending rather than individual loans. Increased group size is believed to be more effective from the resource sharing perspective, which benefits project performance and repayment ability (Ahlin, 2015). Along this line, it is claimed that joint liability induces homogeneous matching (Ghatak, 1999, 2000). Nonetheless, there is a counter argument against large group size. For example, Ahlin (2015) found that homogeneous matching is lost when the group size is larger than two (n > 2). The study has also shown that if information deteriorates sufficiently with the growth of group size, an intermediate group size is better, for outreach and efficiency purposes, than both extremes (Ahlin, 2015).
A group size of five is used by the Grameen Bank along with many of its replications, such as Green Bank (Giné & Karlan, 2014). Some lenders use slightly smaller groups, for example some Al Amana groups had 3-4 members (Crépon et al., 2015), while others use larger groups. As group lending can potentially benefit from liability and resource sharing, the following hypothesis is established:
H7. Bigger group size decreases the micro loan default rate.
Prior studies, such as Greenbaum et al. (2019), Hoque (2010), Coyle (2000), Özdemir and Boran (2004) and Mokhtar et al. (2012), have shown that loan default may be a result of borrowers' unwillingness and/or inability to repay. These results reflect the importance of initial screening of borrowers' ability and commitment, and of the group composition, before granting the loans (Nawai & Shariff, 2010). The following hypothesis is drawn up to test this aspect:
H8. Financial hardship leads to group micro loan default.
Loan purpose is claimed to be a determinant of loan default (Baesens et al., 2005). For example, Okorie (1986) found that borrowers who received a non-cash loan for investment purposes, such as seeds, fertilizer and equipment, demonstrated higher repayment rates than other borrowers who received cash loans. This was because some borrowers misused the cash, diverting it into personal consumption instead of investing the money in making their business more productive (Okorie, 1986).
Nonetheless, some studies found that a higher loan default risk is associated with borrowers who used the funds for small business rather than non-small business purposes (Serrano-Cinca et al., 2015). This argument is supported by Cader and Leatherman (2011), who found that more than 40% of the 90,134 small business samples observed failed after 3 years in business. In contrast to this high failure rate, Agarwal et al. (2007) found that only 3.59% of car loans (non-small business loans) defaulted out of a sample of 6996 observations. Previous results on the impact of loan purpose on default are therefore inconsistent.
Since one of the main objectives of micro loans is to initiate or grow small businesses, moving away from the main investment objective seems more likely to lead to non-productive activities and hence loan default. We therefore propose to test the following hypothesis:
H9. Deviating funds from investment to personal consumption leads to micro loan default.
It has been argued that female borrowers are better payers than their male counterparts (Dinh & Kleimeier, 2007; Roslan & Karim, 2009; Salazar et al., 2008; Schreiner, 2004; Vigano, 1993). This is possibly due to their stronger work ethic and financial discipline (Bhatt & Tang, 2002; Pitt & Khandker, 1998). Compared to male borrowers, females tend to be risk averse (Croson & Gneezy, 2009), and thus more likely to engage in less risky projects (Sharma & Zeller, 1997) and spend money on productive expenditure to enhance income and empower their family (Mohammed & Wobe, 2019). These characteristics in turn increase female borrowers' ability to repay loans. The following hypothesis is established for testing:
H10. Male dominated micro loan groups are more likely to default than female dominated groups.

| Logical development of the research
Facing increasing default rates and criticism of inefficiency (Wijesiri et al., 2017), over reliance on private funds and even mission drift (Mia & Lee, 2017), MFIs need to find ways of improving their policies and approach to accomplish the dual social and financial inclusion mission. This study has pulled together the literature on social capital theory and on loan repayment performance and default with the aim of identifying how the characteristics of the borrowing groups influence loan default. The key objective of this study is to help MFIs refine their lending policies and guidance to reduce loan default rates and develop sustainable credit capacity. The study also aims to support borrowers in partner selection and group relationship management. To accomplish the research objective, this study uses the logistic regression method, as many credit default studies have used it where the dependent variable is binary (Vallini et al., 2008). However, logistic regression, as a traditional statistical method, relies on strong assumptions, such as the type of error distribution, additivity of the parameters within the linear predictor, and proportional hazards (Rajula et al., 2020). This is not the case when using machine learning (ML) methods. The latter methods have the advantage of training statistical models from historical data (Müller & Guido, 2016). This feature makes machine learning models focused on making predictions as accurate as possible, while traditional statistical models are aimed at inferring relationships between variables (Rajula et al., 2020). ML is at the intersection of many other disciplines, such as statistics and computer science.
While traditional statistical models focus on metrics such as R², p-values and statistical significance, ML techniques focus on out-of-sample forecasting and the bias-variance trade-off (Gogas & Papadimitriou, 2021). Unlike traditional statistical methods, ML techniques are valued for their limited dependence on assumptions and their role in process automation (Li et al., 2020). The application of machine learning in our study is also supported by the fact that it is widely used in the literature on credit risk evaluation in microfinance (Bakshi, 2021; Beketnova, 2020; Bhatore et al., 2020; Condori-Alejo et al., 2021; Ruiz et al., 2017). Given the qualities that machine learning possesses, we contrast the result from the traditional logistic regression with that from the machine learning approach.
The remainder of the paper is organised in the following way: Methodology of this paper is explained in Section 2. The analysis and findings are illustrated in Section 3 and finally a summary, recommendations, policy implications and extensions are presented in the conclusion section.
2 | METHODOLOGY
2.1 | Data collection
Data for this study were collected in Morocco from 507 clients who had experience of being part of a joint liability micro-loan group. An initial target of 1000 surveys was planned, but this target proved difficult to reach given the reluctance of some respondents to engage fully or participate in the survey. Nevertheless, we collected a sample of 507 respondents, representing a response rate of 50.7%. The data collection period was initially set to be 3 years, yet the difficulty in collecting the data led us to extend the study period to 6 years (2015-2020). The data collection started at SIST-Cardiff Metropolitan University Branch in Morocco and continued in 2020 in collaboration with Sheffield Hallam University. Due to the geographical dispersion of the survey, students from the four different campuses in the country [Marrakesh (South), Casablanca (West), Rabat (West) and Tangier (North)] provided the necessary support in running the survey and collecting its data. The survey targeted respondents with a low education level (primary to secondary) and an age range of 30-49. This is to exclude the effect of age and education on the default rate.
The survey questions (see Appendix) are divided into three parts: I. Dependent variable questions, II. Control variable questions, III. Predictive variable questions. The survey starts by asking the subjects the main dependent variable question, that is, whether their group loan experienced a default.
The second part of the survey focuses on the control variable questions: The purpose of the loan, Financial Hardship in paying the loan, Group Gender composition, and the size of the group.
Finally, the third part of the survey focuses on the predictive variables of interest. Strong family links (SFL) represent group members who are part of the immediate family of the interviewee. Due to the variety of definitions of which members to include in this category, we restrict it to spouse, parents, grandparents, children, grandchildren, siblings and in-laws (mother, father, brother, sister, daughter and son).
Weak family links (WFL) refer to members of the group who are part of the extended family. This category includes all family members other than the immediate family members described in the strong family category.
Friends (FRI) are members of the group who do not fit in the neighbourhood category or the family category of any one member of the group. Neighbours (NEI) are members of the group who fit neither in the friends category nor in the family category of any one member of the group. We provide the number of responses by year and location (see Table 1).
Four cities were included in the survey. Casablanca accounts for the largest share of responses given its population size and economic power relative to the other cities. With the exception of the year 2020, which has limited data (due to the coronavirus outbreak and lockdowns), the data are fairly evenly distributed over the period of the study.

| Model specification
The response variable, Default, is a binary variable (1 for default or 0 for no default). Therefore, the logistic regression method is adopted for this study. This method is widely used in credit default studies where the dependent variable is binary (Vallini et al., 2008). Default is assessed in two stages. In stage 1, the probability of default, P, is modelled as a logistic function of the primary predictive variables of this study: the kinship variables (WFL, SFL) and the social links variables (NEI, FRI). This stage excludes the control variables.
The second stage includes all the primary predictive variables in addition to the control variables: loan purpose (LoanPurpose), default due to financial hardship (FinHardship), the number of males in the group (MaleGr) and the number of group members (GroupSize). Loan purpose is a binary variable that takes the value 1 if the loan is for investment purposes and 0 otherwise. The same applies to the financial hardship variable, which takes the value 1 if the group suffered financial hardship and 0 otherwise. In this stage, the probability of default, P, is modelled as a logistic function of all eight variables.
Given the qualities that machine learning possesses, as mentioned in the introduction, we will contrast the logistic regression results with those from the machine learning models.
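The two stages described above can be written in the standard logit form. The following is a sketch under that assumption: the coefficient symbols β and γ are notation introduced here for illustration, while the variable names are those defined in the text.

```latex
% Stage 1: primary predictive variables only (beta_i are assumed symbols)
\[
\ln\frac{P}{1-P} = \beta_0 + \beta_1\,\mathrm{SFL} + \beta_2\,\mathrm{WFL}
                 + \beta_3\,\mathrm{NEI} + \beta_4\,\mathrm{FRI}
\]
% Stage 2: primary predictive variables plus controls (gamma_i are assumed symbols)
\[
\ln\frac{P}{1-P} = \gamma_0 + \gamma_1\,\mathrm{SFL} + \gamma_2\,\mathrm{WFL}
                 + \gamma_3\,\mathrm{NEI} + \gamma_4\,\mathrm{FRI}
                 + \gamma_5\,\mathrm{LoanPurpose} + \gamma_6\,\mathrm{FinHardship}
                 + \gamma_7\,\mathrm{MaleGr} + \gamma_8\,\mathrm{GroupSize}
\]
% In both stages, with z the right-hand side, P = 1 / (1 + e^{-z}).
```

Separate symbols are used for the two stages because the coefficients on the tie variables are re-estimated once the controls are added.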

| Descriptive results
We describe the distribution of the studied sample in terms of the predictive variable and the control variables (see Figure 1).
In terms of member type, the weighted average presence of each type in a group is presented. In this sample, we note a slight dominance of social ties (52%) in group lending compared to kinship ties (48%; see Figure 1a).
In terms of sub-categories, we note the following. On the social links side, groups are formed mainly of friends rather than neighbours. With an overall presence of 38% across all groups, friends represent more than three times the presence of neighbours (see Figure 1b).
On the kinship side, strong family links dominate weak family links. With an overall presence of 30% across all groups, strong family link members represent more than twice the presence of weak family link members (see Figure 1b).
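One possible reading of the "weighted presence" of each member type is the share of that type among all members pooled across groups, so that larger groups carry more weight. The sketch below illustrates that reading only; the function name `weighted_presence` and the three example groups are hypothetical and are not taken from the survey data.

```python
from collections import Counter

# Tie categories as defined in the data collection section.
TIE_TYPES = ("SFL", "WFL", "FRI", "NEI")


def weighted_presence(groups):
    """Share of each tie type among all members pooled across groups.

    Pooling members (rather than averaging per-group shares) means that
    larger groups weigh more. This weighting is an assumption.
    """
    pooled = Counter()
    for members in groups:
        pooled.update(members)
    total = sum(pooled.values())
    return {tie: pooled.get(tie, 0) / total for tie in TIE_TYPES}


# Hypothetical groups: each list holds the tie type of every co-borrower.
groups = [
    ["SFL", "SFL", "FRI"],
    ["FRI", "NEI"],
    ["WFL", "FRI", "FRI", "SFL"],
]

presence = weighted_presence(groups)
kinship_share = presence["SFL"] + presence["WFL"]
social_share = presence["FRI"] + presence["NEI"]
print(presence, kinship_share, social_share)
```

With this pooling, the kinship and social shares always sum to one, which matches the way the 52% and 48% figures are reported.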

FIGURE 1 Members type weighted presence in a group
The sample also shows that groups are composed mainly of strong family link members and friends rather than neighbours or weak family link members (see Figure 1b). In terms of control variables, males (58%) outnumber females (42%) in the groups (see Figure 1f). In terms of the objective of the loan (see Figure 1d), the sample shows that the majority (66%) intended to use the loan for investment purposes. While this is the main purpose of a micro-finance loan, 34% deviated from this purpose and intended to buy consumer goods rather than start a small business. In terms of financial distress (see Figure 1e), 57% of the groups genuinely suffered financial distress while 43% did not. This could suggest that default is at times voluntary rather than a forced outcome of financial distress.
In terms of the number of members in a typical group (see Figure 1c), nearly half of the groups in the sample are composed of four or more members. This is followed by groups of three members (31%) and then groups of two members (21%).

| Results of the logit model
The aim of this logistic model is to investigate the effect of group composition, in terms of kinship and social ties, on the risk of default. We use a training set of 80% and a testing set of 20%. In terms of goodness of fit, the Hosmer and Lemeshow test is not significant (p = 0.397 > 0.05), which means that the model has a good fit. In terms of predictive ability, the classification table shows that the model assigns groups to their original categories correctly at a rate of 79%. Approximately 80% of defaulted groups and 78% of solvent groups are correctly classified.
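The 80/20 split and the rates read from a classification table can be sketched as follows. This is a minimal illustration using only the standard library; the function names, the random seed and the five example predictions are hypothetical, and only the sample size of 507 comes from the text.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def classification_rates(actual, predicted):
    """Overall and per-class rates of correct classification (1 = default)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return {
        "overall": (tp + tn) / len(pairs),
        "defaulted": tp / (tp + fn),  # share of defaulted groups classified correctly
        "solvent": tn / (tn + fp),    # share of solvent groups classified correctly
    }


# Split 507 respondent indices into training and testing sets.
train, test = split_train_test(range(507))

# Hypothetical outcomes and predictions, for illustration only.
actual = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 1]
rates = classification_rates(actual, predicted)
print(len(train), len(test), rates)
```

The per-class rates assume that both classes appear in the evaluated set; an empty class would need to be handled before dividing.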

| Regression results
The logit model results (see Table 3) show, based on the odds ratios, that strong family links in a group make the highest contribution to default, followed by weaker family links. On the other hand, social links reduce the default rate. Groups composed of neighbours are better than groups composed of friends at lowering default. 'Loan purpose' and 'Financial Hardship' have a negative β but were not significant. However, the p-values of the logit fit give the significance of the given variate in influencing the predicted result.
The p-values are not significance values for the null hypotheses themselves. Therefore, we run a Chi-square test (see Table 4) and use the significance of that test to validate the hypotheses.
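As an illustration of the kind of test referred to above, a Pearson chi-square test of independence on a 2x2 table (for example, a binary predictor against default) can be sketched with the standard library. The counts below are hypothetical, the function name is invented for this sketch, and no continuity correction is applied; the exact variant of the test used in the study is not specified in the text.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 contingency table.

    With one degree of freedom, the upper-tail probability of the
    chi-square distribution equals erfc(sqrt(x / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows = predictor present / absent,
# columns = default / no default.
table = [[30, 20], [20, 30]]
statistic, p_value = chi_square_2x2(table)
print(statistic, p_value)
```

A p-value below the chosen significance level would indicate that the predictor and default are not independent in the sample.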

| DISCUSSION
Based on the χ² test, there was sufficient statistical significance to accept or reject all but two of the hypotheses (see Table 5).
We discuss these hypotheses and the decisions around them in two steps: (1) primary predictive variables and (2) control variables.
In terms of the first hypothesis, the results in Tables 3 and 1 show that the family links variables (SFL, WFL) have higher odds of default in comparison with the social links variables (FRI, NEI). This result goes against the findings of Bourdieu (1973) and Mustakallio et al. (2002), who advocate strong family ties as a positive factor in business success. Our results, then, suggest that for a successful group liability loan, a higher weighting in terms of group members should be given to social members, such as friends and neighbours, rather than family members.
In terms of H2, the results show that the stronger the family links are, the higher the odds of loan default. This is against the findings of Jackson and Young (2016), who suggest that stronger family ties provide stronger solidarity and trust among members, which can be used to leverage opportunistic behaviour and build resources and business resilience. On the other hand, our findings are aligned with those of Dufhues et al. (2012) and Ahlin and Townsend (2007), who revealed that the close bond and regular interaction between group loan members can conceal important project information, impede social sanctions and promote collusion. Our results then suggest that if a group is to be composed of family members, members with strong family ties (immediate family) should not be part of the group. A possible explanation is that each member of the family strongly relies on the performance of the other members. Therefore, even if default occurs, the social penalty is less severe as it is self-contained within the household and members shield one another from the social pressure coming from the wider community. While this result rejects H2, it does confirm our H5 that stronger family ties increase the odds of default.
In terms of H3, the results in Tables 3 and 1 show that family members contribute positively to default while neighbours and friends reduce it. This suggests that introducing members of the social category (friends and neighbours) reduces default compared with groups composed of family members only. We therefore accept H3.
Table 5 (excerpt) H9: Deviating funds to consumption rather than investments leads to default. Not enough evidence to accept. H10: Male dominated groups are more likely to default than their female counterparts. Accept.
H4 is different from H3 as it tests whether diversity in the group leads to a lower default rate than if the groups are composed of social members only (friends and neighbours). The results from Tables 3 and 1 show that family members contribute positively to default while neighbours and friends reduce it. This suggests that including family members in a diversified group should exacerbate default rather than reduce it. While the covered literature recommends diversity in the group (Fransen, 2015; Hunt et al., 2015; Nahapiet & Ghoshal, 1998; Paxton, 2002; Putnam, 2004; Putnam et al., 1994; Woolcock & Narayan, 2000), our recommendation is rather conditional. We recommend diversity in a group that contains family members so that the odds of default are mitigated by social pressure from friends and neighbours. We do not recommend diversity in a group that contains friends and neighbours, as introducing family members would simply exacerbate the odds of default. We therefore reject H4.
While H5 has been discussed in conjunction with H2, H6 is largely aligned with H4. While H6 emphasises the importance of social ties over kinship ties, it adds an extra layer: that of the strength of the social tie. As previously stated, while laying out the foundation for the hypothesis in the introduction section, we assumed that neighbours have stronger ties than friends as they carry higher social pressure and social sanction (Besley & Coate, 1995). From Tables 3 and 1, we can see that neighbours make the highest contribution to 'no default' compared to friends. This means that the stronger the social ties are, the lower the probability of default. Our finding contradicts those of Dufhues et al. (2012) and Ahlin and Townsend (2007), even though the nature of the ties in their studies is not specified, that is, whether the ties are family ones (kinship) or social ones (friends and neighbours). H6 is therefore rejected.

| Hypothesis testing results: Primary predictive variables
H7-H10 are hypothesis around the control variables (Gender, Financial Hardship, Loan Purpose, Group Size) impact on the Group lending default rate. H7, tests whether or not an increase in the group size, should lead to lower default rate. Our findings suggest that an increase in the size of the group increases the default rate. This result is inconsistent with the findings of (Ahlin, 2015;Cassar et al., 2007;Ghatak, 1999;Ghatak, 2000) who claim that an increase in group size induces more social pressure to pay and foster homogeneous matching. Our results are however consistent with the findings of Ahlin (2015) (homogeneous matching is lost with bigger size) and (Ahlin, 2015; In intermediate group are more efficient). Our findings therefore reject H7. A possible explanation for this result is that the higher is the group size, the more likelihood that the group suffers from free riders who rely on others to pay the group loan. We recommend lender to exercise more precautionary measures for larger groups.
H8 tests whether financial hardship leads to default. Our results are quite surprising, as the impact of this factor on default was not found to be significant. There is not enough evidence to suggest that financial hardship leads to default in group lending. This suggests that group default can be voluntary rather than an inevitable consequence of financial hardship.
H9 tests whether diverting funds to consumption, rather than investment, leads to default. Our results show that there is not enough evidence to suggest that loan purpose affects the default rate. Our results, therefore, do not support the views of Baesens et al. (2005) and Okorie (1986), who claim that diverting funds to consumption leads to default. Nor do they support the findings of Serrano-Cinca et al. (2015), Cader and Leatherman (2011) and Agarwal et al. (2007), who claim that the odds of default are higher for investment loans (small businesses) than for consumption loans (car loans).
For H10, we test whether males have higher odds of default than females. Table 3 confirms this: males are nearly three times more likely to default. This result is consistent with previous literature (Bhatt & Tang, 2002; Croson & Gneezy, 2009; Dinh & Kleimeier, 2007; Pitt & Khandker, 1998; Roslan & Karim, 2009; Salazar et al., 2008; Schreiner, 2004; Sharma & Zeller, 1997; Vigano, 1993). The literature suggests several possible reasons for this outcome, including that females have higher risk aversion, a stronger work ethic and greater financial discipline than males. We would add that in developing countries such as Morocco, females with a low level of education are usually housewives and spend more time engaging and socialising with their female neighbours. This higher social engagement, compared to males, makes them more concerned about the social sanction they may face if they default on their joint liabilities.
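To illustrate how a "nearly three times" figure is usually read in a logit model, the sketch below assumes it is an odds ratio, that is, the exponential of the Gender coefficient. The coefficient `beta_male` and the baseline probability `p_female` are hypothetical values chosen for illustration; they are not the estimates reported in Table 3.

```python
import math

def odds_ratio(beta: float) -> float:
    """Odds ratio implied by a logit coefficient: exp(beta)."""
    return math.exp(beta)

def shift_probability(p_base: float, ratio: float) -> float:
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds = p_base / (1.0 - p_base) * ratio
    return odds / (1.0 + odds)

# Hypothetical coefficient chosen so that the odds ratio equals 3.
beta_male = math.log(3.0)
# Hypothetical baseline default probability for female borrowers.
p_female = 0.10

p_male = shift_probability(p_female, odds_ratio(beta_male))
print(round(odds_ratio(beta_male), 2), round(p_male, 2))
```

Under these assumed numbers, tripling the odds moves a 10% default probability to 25%, not to 30%: an odds ratio of three is not the same as three times the probability.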

| Logit results compared with other machine learning models
Our study focuses on the use of logistic regression to predict credit default in microfinance. However, the suitability of this method needs to be contrasted with machine learning methods, given the qualities they possess as discussed earlier in the introduction. Below is a description of the three widely used machine learning models employed for this purpose:
• The support vector machine (SVM) is a predictive model that can be used to solve both linear and nonlinear problems. The idea of the SVM is to use the training data to find the hyperplane that best separates two or more classes.
• The classification and regression tree (CART) classifier is a popular predictive if-else algorithm that works for both continuous and categorical variables. One of the most important advantages of the CART algorithm is that it allows simple interpretation and visualisation of data patterns.
• The K-nearest neighbours (KNN) algorithm is one of the simplest and most popular machine learning algorithms. KNN is based on feature similarity: it assigns a new object to the most frequent class among its K nearest neighbours.
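As a minimal sketch of the KNN rule described above (majority vote among the K nearest training points), the following uses Euclidean distance on two hypothetical features. The toy data, the feature names and the choice of K = 3 are assumptions for illustration only and do not reproduce the study's data or tuning.

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k=3):
    """Assign x to the most frequent class among its k nearest training points."""
    # Euclidean distance from x to every training point, paired with its label.
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    # Majority vote over the labels of the k closest points.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (share of family members in the group, group size).
train_X = [(0.9, 4), (0.8, 5), (1.0, 4), (0.1, 3), (0.0, 3), (0.2, 2)]
train_y = [1, 1, 1, 0, 0, 0]  # 1 = default, 0 = no default

print(knn_predict(train_X, train_y, (0.85, 4), k=3))
print(knn_predict(train_X, train_y, (0.05, 3), k=3))
```

In practice the features would normally be put on a common scale first, since an unscaled variable such as group size can dominate the distance.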
Ten-fold cross-validation is used to optimise the hyperparameters of the different machine learning models. The models are evaluated using four metrics (see Table 6): accuracy score, sensitivity, area under the curve (AUC) and the receiver operating characteristic (ROC) curve (see Figure 2). The accuracy score is the ratio of correctly predicted samples to the total number of samples. The sensitivity is the percentage of actual defaulted groups that are predicted as defaulted. The AUC is a metric that ranges from 0 to 1: the higher its value, the better the performance of the model.
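The sketch below illustrates, under simplifying assumptions, how a sample of 507 responses can be partitioned into ten folds and how accuracy and sensitivity follow from their definitions above. The helper names and the labels and predictions are hypothetical; real cross-validation would typically shuffle or stratify the data before splitting.

```python
def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def accuracy(y_true, y_pred):
    """Share of samples whose predicted class matches the actual class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def sensitivity(y_true, y_pred):
    """Share of actual defaults (label 1) that are predicted as defaults."""
    positives = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    return sum(p == 1 for _, p in positives) / len(positives)

folds = k_fold_indices(507, k=10)
print([len(f) for f in folds])

# Hypothetical labels and predictions, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
print(accuracy(y_true, y_pred), sensitivity(y_true, y_pred))
```

Each fold serves once as the validation set while the remaining nine are used for fitting, and the metric is averaged over the ten runs.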
The results show that logistic regression is marginally superior to the other three models, with the SVM model the closest to it.
The ROC curve (see Figure 2) plots the sensitivity (the percentage of correctly classified defaulted groups) against 1-specificity (the proportion of solvent groups wrongly classified as defaulted) at various probability thresholds. In our case, all the ROC curves are close to the upper left corner, which indicates a high level of overall accuracy. The areas under the ROC curves are also calculated from the model predictions (AUC = 81% for both logistic regression and SVM); they show that the models have a good capacity to discriminate between defaulted and solvent groups.
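To make the construction of the ROC curve concrete, the sketch below computes the (1-specificity, sensitivity) pairs over all thresholds and integrates them with the trapezoidal rule. The labels and predicted scores are hypothetical and unrelated to the reported AUC of 81%.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, sensitivity) pairs over all score thresholds."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    # Lower the threshold one distinct score at a time, from highest to lowest.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

# Hypothetical labels (1 = default) and predicted default probabilities.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
print(round(auc(points), 3))
```

A curve that hugs the upper left corner encloses an area close to 1, while a model that ranks groups at random yields an area near 0.5.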

| Further investigation of the control variables
The results for the control variables show that two variables (Financial Hardship and Loan Purpose) are not significant predictors of default, while Gender and Group Size are significant. Given their significance, we analyse Gender and Group Size further.

| Further investigation: Gender
We show the default rates for females and males in the different social and kinship categories (see Figure 3). The most interesting results concern males. The highest male default rate occurs in groups dominated by strong family links (SFL). The default rate falls as the family links weaken (WFL). Males have their lowest default rates when they are part of a socially dominated group (friends and neighbours). Visual inspection alone is not enough to draw an empirical conclusion. We therefore run an association test between male default and membership of either social or kinship groups. The Chi-squared tests are significant at the 1% level, suggesting a strong association between male default and kinship or social ties (see Figure 3c). This supports the graphical evidence that male default decreases as we move from a family dominated group (SFL and WFL) to a social group (FRI and NEI).
This could be explained by a cultural factor: in a developing country, males exercise relatively more dominant power than females. We can conclude that males care more about social sanction from friends and neighbours than about sanction within the family. To reduce the default rate, it is therefore recommended that males be included in a social group rather than a family group. Females, on the other hand, maintain a relatively high level of 'no default' throughout. This suggests that females can fit into any of the social and kinship groups while maintaining a very low default rate.
When it comes to group size, females show a very low default rate across the different group sizes. The male default rate, on the other hand, is shown to decrease as the group size increases (see Figure 3b).
To confirm that there is an association between group size and male default, we run a Chi-squared test. The test is significant at the 1% level, suggesting a strong association between male default and group size (see Figure 3d). This supports the graphical evidence that male default decreases as group size increases. We can conclude that, to decrease the male default rate, it is advisable to include males in large socially dominated groups (friends and neighbours).
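The Chi-squared association tests used in this section can be sketched as follows. The function computes the Pearson statistic for a contingency table and compares it with the standard 1% critical value for the relevant degrees of freedom. The counts are hypothetical and are not the data behind Figure 3.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Standard chi-squared critical values at the 1% level, by degrees of freedom.
CRITICAL_1PCT = {1: 6.635, 2: 9.210, 3: 11.345}

# Hypothetical counts of male borrowers; columns are SFL, WFL, FRI, NEI.
table = [
    [30, 20, 8, 5],    # defaulted
    [10, 20, 32, 35],  # did not default
]
stat, dof = chi_squared(table)
print(round(stat, 2), dof, stat > CRITICAL_1PCT[dof])
```

A statistic above the critical value rejects independence at the 1% level; it indicates an association but not its direction, which is why the test is read alongside the plotted default rates.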

| Further investigation: Group size
We show that larger groups carry the highest default rate (see Figure 4a). This is consistent with our decision to reject H7, which posits that a larger group size decreases default.
Adding the subcategories of default together (Kinship links = WFL + SFL, Social links = FRI + NEI), we can assess the default of the social and kinship categories according to group size (see Figure 4b). Once again, default is higher in larger groups, and this is more apparent in social links groups than in kinship groups. This shows that social links groups may lose their default-reducing ability if the group is large (in our case N > 3). We can conclude from Figure 4 that a large group size exacerbates default for both social and kinship groups, with a greater impact on social groups.
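The aggregation described above (Kinship = WFL + SFL, Social = FRI + NEI, broken down by group size) could be computed along the following lines. The record layout and the sample records are hypothetical; only the category mapping and the N > 3 band come from the text.

```python
from collections import defaultdict

# Map each tie subcategory to its broader link type, as in the text.
LINK_TYPE = {"WFL": "Kinship", "SFL": "Kinship", "FRI": "Social", "NEI": "Social"}

def default_rates(records):
    """Default rate per (link type, group-size band) from (tie, size, defaulted) records."""
    counts = defaultdict(lambda: [0, 0])  # key -> [defaults, total]
    for tie, size, defaulted in records:
        band = "N > 3" if size > 3 else f"N = {size}"
        key = (LINK_TYPE[tie], band)
        counts[key][0] += int(defaulted)
        counts[key][1] += 1
    return {key: d / n for key, (d, n) in counts.items()}

# Hypothetical records (tie category, group size, defaulted?), for illustration only.
records = [
    ("SFL", 3, True), ("WFL", 3, False), ("SFL", 5, True), ("WFL", 4, True),
    ("FRI", 3, False), ("NEI", 3, False), ("FRI", 5, True), ("NEI", 4, False),
]
rates = default_rates(records)
for key, rate in sorted(rates.items()):
    print(key, rate)
```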

| CONCLUSIONS
In this study, we investigated the impact of group composition, in terms of Social (friends and neighbours) and Kinship (strong family and weak family) ties, on the default rate of Joint Liability Loans. We found statistical evidence that Kinship ties contribute positively to default while social ties reduce it. More specifically, the stronger the kinship ties, the higher is the probability of default. Control variables (Loan Purpose, Gender, Financial Hardship, Group Size) were also introduced in this study.
The introduction of the control variables did not change the impact of the main predictive variables of our study (the Social and Kinship ties variables); rather, it confirmed some results from the literature and gave rise to new ones. For example, consistent with the literature, females were confirmed to reduce default while males increase it. What we add is (1) that females maintain their superior repayment ability across different social and kinship groups and group sizes; and (2) that the male default rate increases when males are part of a kinship dominated group and falls when they are part of a socially dominated group.
Another control variable, Group Size, was confirmed to increase the default rate. What we add, however, is that group size has a bigger impact on the default rate when the group is socially dominated (friends and neighbours), and that a normal group size yields a lower default rate than the two extremes. The other two variables, Financial Hardship and Loan Purpose, were found to be insignificant. Namely, the fact that groups genuinely incurred financial hardship was not an explanatory factor of default. This means that group defaults could be voluntary rather than an inevitable consequence of financial hardship. Similarly, diverting funds to consumption rather than to their original investment purpose was not found to explain defaults. This suggests that even if the funds were not invested in small business activities as originally planned, defaults may not necessarily occur.
The results from this study can have important policy implications for micro-finance institutions. Namely, groups should be either socially dominated (the first-best case) or diverse in composition (Social and Kinship). Kinship dominated groups are not recommended, as they exacerbate the default rate.
Another policy implication is the confirmation of females' better repayment rates compared to males. We nevertheless suggest a solution for the high male default rate: males are recommended to be part of a socially dominated group (friends and neighbours) instead of a family dominated group.
In terms of Group Size, it is recommended that groups be of normal size (in our case N = 3). This case is shown to perform better than the extremes (N = 2 or N > 3). Increasing the size above the normal size may nullify the benefit of the lower default rate that characterises a socially dominated group and exacerbate the default of a kinship dominated group.
Finally, Loan Purpose and Financial Hardship were not found to be significant in explaining the default rate; we therefore recommend giving them the lowest weight in loan and default evaluation.
As an extension of this work, we propose conducting the survey in other developing countries. The findings should show whether the current results generalise and, therefore, whether the policies are applicable in different countries.