The jobs of the future will be driven by technology and innovation. 65% of children entering primary school today will have jobs that do not yet exist. Half of them are girls.
At the moment the tech industry is male-dominated with the vast majority of tech workers being men. Nevertheless, even with additional barriers to working in the industry, women have championed the advancement of transformative technology and digital education. This shows that women, together with other underrepresented groups, can bring diverse skills and creative solutions to the technology industry.
The lack of inclusion of women in the tech industry also has an economic cost. Over the last ten years, the absence of women in the tech industry has cost low- and middle-income countries $1 trillion in GDP. This loss is predicted to grow to $1.5 trillion by 2025, according to UN Women’s Gender Snapshot 2022 report.
As awareness grows of the inequalities women face in access to technology, it is vital that they are not left behind. For this reason, they need to be uplifted and empowered by examples of women achieving great results. Visible female role models can have a great impact by showing women who are passionate about technology all of the possibilities open to them and their potential to make a difference.
We want to kick off the Social Impact Tech series with our resident expert, Derya Sever, the NLP & Machine Learning Lead at Data Friendly Space (DFS). Derya holds a PhD in Operations Research (Industrial Engineering) from Eindhoven University of Technology (NL) and in her career, from manufacturing to delivery, she has crafted strategies for adaptive and transformational change to sustain project outcomes.
We discussed her experience, what it is like being a woman in a male-dominated technology field, and her advice for women either entering or already in the field.
Women often face gender bias and discrimination in tech and make up only 12% of leading machine learning researchers. Can you share any personal experiences of how you navigated these challenges to excel in your career?
Coming from an industrial engineering and humanitarian background, I have observed that as women, we are minorities in tech and STEM. I took the initiative to excel in machine learning by self-learning from online courses, reading research papers, and hands-on learning to solve humanitarian problems that we encounter in ML and data science. Fortunately, I always had support from the organizations that I work with. I try to join seminars and workshops to increase feminism in AI. We need to invest more to increase local AI know-how for women in countries where communities suffer the most.
As a woman in NLP engineering, what advice would you give to women trying to enter or transition into this field? What challenges did you face and how did you overcome them?
UNESCO’s latest study shows that only 20% of employees in technical roles in machine learning companies, 12% of artificial intelligence researchers globally, and 6% of professional software developers are women. We face a gender gap in the AI space which might also have an impact on gender bias in the outcomes of AI-based decisions.
The challenge of being a woman in the AI space is to experience the lack of diversity, equality, and inclusion measures in the sector. When we join conferences or research, most of the speakers are male. Furthermore, especially in generative AI, we observe gender bias in the outcomes which continuously embeds the bias in media, governance, and everyday life in technology. Although now increasing, we see little effort in providing resources and capacity building of know-how for women, especially in low-resource countries. My advice to anyone entering or transitioning into the field is to explore all resources, join in, and create more Women in AI communities to tackle these challenges. The current gap should not make us take a step back but only encourage more admiration and courage to mitigate and eliminate gender-biased decisions. We should follow and build the advancements in technology as women. I also advise searching for funds for research and development for women in AI. Women should not hesitate to apply to AI job profiles even if they meet only part of the criteria, as AI is a continuous learning process. We need to become advocates in our working environments for gender equality and inclusion to create learning capacities for women who have limited resources but are ambitious in AI technology.
What do you believe are the most exciting aspects or trends in NLP right now and how do you think these can influence and change the social impact sector?
We are in an era where data and collaboration shape decisions and policies. With the current advances in NLP, especially LLMs, we can easily analyze and get insights from vast amounts of data from many resources, collaborate, and automate our tasks efficiently. The paradigm shift in language models now allows us to perform NLP tasks with LLMs, without the need to train task-specific models on small task-specific datasets.
If used responsibly and ethically, NLP can lead to faster, more accurate, and more efficient operational decisions to improve the lives of communities in need. For instance, we can now extract impact, needs, and response information from reports, media, and social listening before and during humanitarian crises. Previously, it took weeks to respond and deliver assistance; now we can achieve this in days. The impact of NLP is not limited to the humanitarian sector; it also extends to detecting trends and sentiments around climate crises so that anticipatory action can be taken. We can also use NLP for women, labor, and human rights advocacy to detect violations around the globe. However, I emphasize that we need to ensure that bias and ethical issues are mitigated, ensuring transparent and fair outcomes.
Can you share your experience with bias and fairness in NLP models? What steps have you taken to ensure more equitable and inclusive systems?
Intrinsic bias in the data is embedded and absorbed in NLP models, especially in LLMs where the training corpus is unknown. Furthermore, the models still have intrinsic bias even after mitigation through reinforcement learning from human feedback. Both dataset and algorithmic bias contribute to outputs, and thus decision-making, that reflect gender, race, or other forms of bias; the bias is not only reflected but amplified at higher speed. Current LLMs mostly reflect a Global North corpus and perspective, and the lack of native language-based datasets creates one-sided outputs. In our group, we work with humanitarian news and reports where mitigating bias is crucial in making sure that the communities in need are assisted with impartiality and neutrality. Gender and race bias neutralization models are implemented in our humanitarian dataset (HUMSET). We are further aiming to include native languages and diverse sources, and to mitigate and eliminate bias in data, in models, and in postprocessing as well.
NLP means working with massive datasets. How do you approach data collection, cleaning, and preparation to ensure the best model performance?
For the NLP life cycle, ensuring quality, diversity, privacy, and security is crucial to have bias-mitigated and high-quality outcomes. Before diving into data acquisition, it is very important to define a data strategy with acceptance criteria for fairness, quality, security, diversity, etc. We should make sure to choose a secure data platform to store private data and minimize data infringement possibilities. Determining which data sources help mitigate bias, how to ensure privacy, and how best to clean and structure the data for the model is one of our key pillars.
Another important practice is to define data readiness levels (e.g. accessibility, validity, and utility) to facilitate communication during the development process. Depending on the NLP tasks, we should strive for high-utility data, where we answer whether the data is accessible, whether we can use or store the specific data in compliance with GDPR, how we can ensure privacy, which data is needed to solve the specific problem, whether the quality of the data and annotations is good enough for the model, and whether the data is representative. For storage, we should also strive for appropriate pre-processing for the best performance of NLP models.
What ethical considerations do you keep in mind when developing AI systems, particularly NLP, and how do you ensure responsible development?
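As a rough illustration of the data readiness levels described above, the review questions could be recorded as a simple checklist. This is only a sketch under stated assumptions: the `ReadinessCheck` class, the `readiness_level` function, the ordering of the levels, and the assignment of each question to a level are all hypothetical and are not taken from the DFS workflow.

```python
from dataclasses import dataclass


@dataclass
class ReadinessCheck:
    """One yes/no question from a data readiness review (hypothetical structure)."""
    level: str      # "accessibility", "validity", or "utility"
    question: str
    passed: bool


# Assumed ordering: a dataset reaches a level only if every check at
# that level and at all earlier levels has passed.
LEVELS = ["accessibility", "validity", "utility"]


def readiness_level(checks: list[ReadinessCheck]) -> str:
    """Return the highest level reached, or "not ready" if the first level fails."""
    reached = "not ready"
    for level in LEVELS:
        level_checks = [c for c in checks if c.level == level]
        # A level with no recorded checks is treated as not yet assessed.
        if not level_checks or not all(c.passed for c in level_checks):
            break
        reached = level
    return reached


# Example review; the mapping of questions to levels is an assumption.
checks = [
    ReadinessCheck("accessibility", "Can we access the data?", True),
    ReadinessCheck("accessibility", "Can we use and store it in compliance with GDPR?", True),
    ReadinessCheck("validity", "Is the quality of the data and annotations sufficient?", True),
    ReadinessCheck("validity", "Is the data representative?", False),
    ReadinessCheck("utility", "Does the data address the specific problem?", True),
]

print(readiness_level(checks))  # prints "accessibility"
```

Here the dataset stops at the accessibility level because one validity check fails, which gives the team a shared, short label to use when discussing what still needs work.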
In our AI strategy, implementing and ensuring ethical and responsible AI pillars is a must. The foundation of ethical AI systems should follow humanitarian principles (humanity, neutrality, impartiality, and independence) and also include the rights of nature, bearing in mind our footprint in the climate crisis while using AI systems. By mitigating bias and discrimination we should ensure no one is left behind in the AI outcomes. We need to focus on more local languages and voices from locally-led organizations. We need to be held accountable and take ownership of the decisions from the NLP models. Transparency and explainability are some of the goals for our NLP-based platforms and solutions. Users should understand why the model has come up with the specific solution so that human feedback is empowered. Security and privacy have to be maintained at the highest level considering our data architectures and cloud services.
The field of NLP is continuously evolving and changing. How do you stay up to date with the latest technologies and what resources do you use?
I follow NLP research papers and technology leads. I engage with newsletters and feeds from Twitter and LinkedIn, and I listen to podcasts to explore how we can adapt these advances to our space and to understand the challenges. I also enjoy joining webinars and conferences focused on humanitarian, development, and generative AI technology advancements. In our NLP group, every week we explore the advancements and challenges by listening to podcasts and by reading and analyzing research papers.
Continue to follow along as we explore social impact tech.