Saturday, January 11, 2025

Techniques for AI

 


We all know that humans have a powerful brain that can think, feel, and predict things.

For instance: when we see a night sky full of stars, we can say it is going to be sunny the next morning. This prediction is based on our years of experience.

While humans learn from experience, can computers do the same? The answer is "yes", and this is what is called machine learning.

Artificial intelligence is the study and application of many different techniques - let's see some of them.


Machine learning:

Machine learning is a technique by which machines learn and improve their performance and predictions from experience, using computational methods and algorithms. It is one of the most rapidly growing fields in technology.
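As a sketch of what "learning from experience" means computationally, the snippet below fits a straight line to made-up observations with ordinary least squares and then predicts an unseen point. All the numbers and names are illustrative.

```python
# A minimal sketch of learning from data: fit y = a*x + b to observations
# with ordinary least squares, then predict an unseen input.

def fit_line(xs, ys):
    """Return slope a and intercept b minimizing squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b

# "Experience": made-up past observations that follow y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

a, b = fit_line(xs, ys)
prediction = a * 6 + b      # predict for the unseen input x = 6
print(a, b, prediction)
```

The machine was never told the rule y = 2x + 1; it recovered it from the data, which is the essence of the technique.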


Deep Learning: 

This is a branch of machine learning in which computers are programmed to automatically learn complicated concepts across multiple layers of representation, building them up from simpler ones.



Deep learning methods are inspired by the human brain: machines learn from data using artificial neural networks.

Forms of learning for a machine model

Supervised learning:

In this kind, the machine is trained on labeled data, with the process mostly guided by a human.

Unsupervised learning:

Sometimes the data does not contain any labels or historical information, or may contain extra factors. In this kind, the machine must find patterns and structure in the data on its own.

Reinforcement learning:

In this kind of learning, the machine is programmed to learn from its own experience: a programmed agent receives rewards and punishments, without needing to be told how the task is to be achieved.
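To make the reward-and-punishment idea concrete, here is a toy sketch: an agent chooses between two made-up actions, receives a fixed reward, and updates its value estimates without ever being told which action is correct. The environment and all numbers are invented for illustration.

```python
import random

# Toy reward-driven learning: the agent is never told which action is
# correct; it discovers the better one purely from the reward signal.
rewards = {"left": 0.2, "right": 1.0}   # the environment (unknown to agent)
values = {"left": 0.0, "right": 0.0}    # agent's learned value estimates
counts = {"left": 0, "right": 0}

# sample each action once so both have an initial estimate
for action in rewards:
    counts[action] += 1
    values[action] = rewards[action]

random.seed(0)
for step in range(200):
    # mostly exploit the best-known action, occasionally explore
    if random.random() < 0.1:
        action = random.choice(["left", "right"])
    else:
        action = max(values, key=values.get)
    reward = rewards[action]
    counts[action] += 1
    # incremental average: nudge the estimate toward the observed reward
    values[action] += (reward - values[action]) / counts[action]

print(max(values, key=values.get))  # the agent ends up preferring "right"
```

Nobody specified "pick right"; the preference emerges from the reward signal alone, which is the defining trait of reinforcement learning.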

Natural Language Processing (NLP):

NLP is a spectrum of theory-based computational techniques that focuses on enabling computers to understand, interpret, and respond to human language in a way that is both meaningful and contextually appropriate.



It uses computational techniques from various fields such as computer science, artificial intelligence, linguistics, and data science to enable computers to understand human language, both written and spoken.

For example - technologies such as chatbots, or speech recognition products like Amazon's Alexa, Apple's Siri, or Google's "Hey Google".

Computers are programmed to interpret and process human spoken language in the form of text or sound.

Subtypes of NLP

Natural Language Understanding (NLU):

NLU builds an associated representation: a data structure that specifies the relationships between words and phrases.

For example: "I hit my left arm because I left my glasses on the table."

In the sentence above, the word "left" in the first part of the sentence has a completely different meaning than the word "left" in the second part. People easily distinguish between such homonyms, which alter the nuances of spoken language.

Translating human language into a computer-understandable representation is not trivial, because in language the same group of words can have different meanings depending on the context.

Another example - the phrase "What?" when used with different emotions can mean different things.

  • When a person is in shock - they will say "What?" with eyes wide open.
  • When a person is confused - they will say "What?" with eyebrows furrowed or tilted.
  • When a person is surprised - they will say "What?" with mouth open and wide eyes.

We humans can read facial expressions along with phrases to understand a sentence's emotion - but for machines this is still a challenge, and capturing words together with emotions needs further work.

This kind of analysis is often used in data mining to understand customer feedback. Sentiment analysis allows brands to monitor customer feedback closely and take corrective action.
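A minimal sketch of such sentiment analysis, using a tiny made-up word list rather than a real lexicon: count positive versus negative words and report the overall polarity.

```python
# Lexicon-based sentiment scoring: the word lists below are a tiny
# illustrative subset, not a real sentiment lexicon.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}

def sentiment(text):
    """Label text positive, negative, or neutral by word counts."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love this product great battery"))    # positive
print(sentiment("terrible support and slow delivery"))   # negative
```

Real systems handle negation, sarcasm, and context, but the core idea of scoring feedback text is the same.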


Natural Language Generation (NLG):

This is another area of NLP in which machines are programmed to understand human language and produce human-readable responses in text form, which can also be connected to speech using the "Text-to-speech" conversion method.

NLG has the ability to generate an overview of its given input documents while maintaining the integrity of the entered information.

Example - ChatGPT 

This area of study involves understanding existing techniques and developing new processes for machines to analyze and manipulate human languages.


Computer Vision:

Computer Vision, also known as Machine Vision, is the field of artificial intelligence that focuses on making machines interpret and understand visual data.

Its goal is to give computers the ability to extract high-level understanding from digital images and videos. 

This might seem easy, but for computers it is not: unlike humans, machines do not have the gift of vision and perception.

So, to a machine, an image looks like a massive array of integers, each representing a color and intensity.

Algorithms are designed to use machine learning in order to train the machine to understand an image.

Computer vision algorithms used today are based on pattern recognition and typically rely on Convolutional Neural Networks (CNNs). Computers are trained on enormous amounts of data using machine learning techniques to find patterns in images - for example, to identify a face in an image.
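To make this concrete, the sketch below slides a small hand-written kernel over a made-up grid of pixel intensities. This convolution is the core operation inside a CNN, though a real network learns its kernel weights and adds pooling and nonlinearities.

```python
# A made-up 4x5 "image": intensity jumps from 0 to 10 at a vertical edge.
image = [
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
    [0, 0, 0, 10, 10],
]

# Hand-written vertical-edge detector: responds where intensity jumps.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

def convolve(img, k):
    """Slide kernel k over img, computing a dot product at each position."""
    kh, kw = len(k), len(k[0])
    out = []
    for i in range(len(img) - kh + 1):
        row = []
        for j in range(len(img[0]) - kw + 1):
            s = sum(img[i + di][j + dj] * k[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

print(convolve(image, kernel))  # large values mark the vertical edge
```

The large responses in the output line up with the edge in the input, which is exactly the kind of pattern a trained CNN filter picks out.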

A popular use case of computer vision is "object recognition"; one of its applications is identity detection.

Robotics:

This is another branch of artificial intelligence concerned with building machines, called robots, to substitute for human actions. Put another way, the end products of robotics are robots: programmable machines that mimic human actions.


Robots were initially built to perform repetitive and monotonous tasks like assembling cars, but today the area of their application has expanded; we even use robots in our homes to vacuum.

The limitation of robots so far has been that a robot can perform ONLY the task for which it is programmed and nothing else. This field is expanding, with ongoing exploration of how to make robots deal with their environment and perform tasks the way humans do.

Conclusion :

In short, AI consists of several techniques and applications, and each branch is itself vast and requires further exploration. AI has a huge impact on our lifestyle.

Considering the field of AI, this is definitely a very fast-growing and in-demand field. As of 2023, if you are looking to build a career in computer science, there is definitely no other field like AI. It is the most promising area.


Friday, July 26, 2024

What is Natural Language Processing (NLP)?

Natural Language Processing is the branch of Artificial Intelligence that gives machines the ability to read, understand, and derive meaning from human languages.


Definition :

NLP stands for Natural Language Processing. It is a field of artificial intelligence and computational linguistics that focuses on enabling computers to understand, interpret, and generate human language in a way that is both meaningful and valuable. NLP involves the interaction between computers and human language, encompassing tasks such as language understanding, language generation, language translation, sentiment analysis, and more.

NLP aims to bridge the gap between human communication and computer understanding, allowing machines to process and analyze large amounts of textual data, extract insights, and perform various language-related tasks


NLP combines the fields of linguistics and computer science to decode language structure and rules, and to build models that can comprehend, break down, and extract significant details from text and speech.


Humans interact with each other through various mediums, transferring vast amounts of data. This data is very useful for understanding customer behavior and learning customer habits.
This data is mostly unstructured, and data scientists use it to train machines to understand human linguistics.

Understanding NLP :

NLP encompasses a wide range of tasks, from language understanding to language generation, and it forms the foundation for various applications that involve processing and analyzing text or speech data.

NLG (Natural Language Generation) and NLU (Natural Language Understanding) are two key components of NLP that focus on different aspects of working with human language.

  • Natural Language Generation (NLG): NLG is about generating human-like language from structured data or concepts.
Example of NLG: Imagine an e-commerce platform that generates product descriptions for various items based on their attributes. An NLG system could take information like product specifications, features, and customer reviews, and transform it into a well-written product description.

  • Natural Language Understanding (NLU): NLU is about extracting meaning and intent from unstructured human language input.
Example of NLU: Consider a virtual assistant like Siri or Google Assistant. When you ask, "What's the weather like today?", the NLU component of these assistants needs to understand your intent (asking for the weather) and extract relevant information (today's weather forecast) from your input. NLU helps the system grasp the user's intention and respond appropriately.

How NLP works:

An NLP algorithm is composed of several smaller algorithms. Typically, a machine is given unstructured data, like a simple sentence, which it needs to interpret in order to generate human-like answers. Let us understand how NLP works using the sample sentence below.

Example sentence: There are townhouses available for rent in Nashville's downtown.



1. Tokenization :

The first step in NLP involves preparing the raw text for analysis. This often includes tasks like tokenization (breaking text into words or subword units), removing punctuation, converting text to lowercase, and handling special characters. This step aims to create a structured representation of the text that can be easily processed.

In our case - our sentence will be tokenized as 
"There", "are", "townhouses", "available", "for", "rent", "in", "Nashville's", "downtown", "."

During tokenization, the sentence is further normalized by the following processes:

1.1 Removal of Stop Words

At this stage, all words which do not add much meaning to the sentence are removed. Some examples of stop words are "and", "is", "are", "as", and "the".

In our case - our sentence after removing stop-words will look like
"townhouses", "available", "rent", "Nashville's", "downtown", "."

1.2. Stemming 

The next step is to further normalize the sentence by identifying the stem word for each token. Machines are built with a dictionary of words that can mean the same thing despite added prefixes or suffixes.
For example:  "ran", "runs" and "running" - for these three different words a STEM word "run" will be defined.

BUT, stemming does not work for every token.
For example, the words "universal" and "university" should not be stemmed to "universe". In such cases, the machine goes through the process of lemmatization.
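A minimal sketch of rule-based suffix stripping, the idea behind stemmers such as Porter's. The suffix list here is a tiny illustrative subset, and the failure cases show why lemmatization is also needed.

```python
# Naive suffix stripping: strip the first matching suffix, keeping at
# least a few characters of the word. Real stemmers have many more rules.
SUFFIXES = ["ning", "ing", "s", "ed"]

def naive_stem(word):
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

print(naive_stem("runs"))        # run
print(naive_stem("running"))     # run
print(naive_stem("ran"))         # ran  - irregular form, rules cannot help
print(naive_stem("university"))  # unchanged: no listed suffix matches
```

"ran" and "university" slip through the rules, which is exactly where a dictionary-based lemmatizer takes over.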

1.3 Lemmatization

In this case, the machine takes the token, looks up its meaning in a dictionary, and then reduces it to its root (lemma).


2. Part-of-Speech Tagging:

In this step - each token is assigned a part-of-speech tag to indicate its grammatical role in the sentence:


In our case - our sentence looks like
"townhouses" (NOUN), "available" (ADJ), "rent" (NOUN), "Nashville's" (NOUN), "downtown" (NOUN)


3. Semantic Analysis:

During syntactic analysis, the structure of the sentence is determined and relationships between words are identified. Semantic analysis then interprets the sentence's meaning, taking those relationships into account. Here, the sentence conveys that there are townhouses that can be rented in the downtown area of Nashville.

4. Named Entity Recognition (NER):

The NER step identifies named entities (people, places, organizations, etc.) in the sentence:

In case of our sentence: 
"Nashville" is recognized as a location entity.

5. Contextual Understanding (Transformer Model):

If a transformer model like BERT or GPT is used, it would analyze the sentence bidirectionally, considering the context of all words. This contextual understanding helps the model grasp the subtle nuances and relationships in the sentence.


Applications of NLP


Language Translation: NLP powers the technology behind language translation services like Google Translate (https://translate.google.com/), allowing people to communicate across language barriers effortlessly.

Sentiment Analysis: Businesses use NLP to gauge public sentiment by analyzing social media posts, customer reviews, product feedback, and news articles, helping them make informed decisions and tailor their strategies.

Chatbots and Virtual Assistants: This is definitely a breakthrough: now, when you call for assistance, you often interact with bots instead of humans. NLP drives the conversational abilities of chatbots on websites, enhancing customer interactions and providing instant assistance.

Text Summarization: NLP algorithms are used to condense lengthy articles, documents, and reports into concise summaries, aiding in information extraction and quick comprehension. For example, ChatGPT.

Speech Recognition: Voice-activated devices like Siri, Alexa, and Google's "Hey Google" are speech-to-text applications that rely on NLP for accurate speech recognition, enabling hands-free communication and transcription.

Named Entity Recognition: NLP helps identify and classify entities such as names of people, organizations, and locations in text, useful for information extraction and knowledge management.


Challenges in NLP


Despite its impressive capabilities, NLP faces several challenges:

  • Ambiguity of sentences.
  • Contextual understanding.

For example, the phrase "What?" when used with different emotions can mean different things.
  • When a person is in shock - they will say "What?" with eyes wide open.
  • When a person is confused - they will say "What?" with eyebrows furrowed or tilted.
  • When a person is surprised - they will say "What?" with mouth open and wide eyes.

We humans can read facial expressions along with phrases to understand a sentence's emotion - but for machines this is still a challenge, and capturing words together with emotions needs further work.


But nevertheless, NLP is a fast-growing sector in AI and an in-demand skill. One of the most significant breakthroughs in NLP is the development of transformer models, which have revolutionized the field. Models like BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer) have set new benchmarks in tasks like language understanding, machine translation, and text generation. These models leverage vast amounts of training data and extensive computational resources to achieve remarkable performance on a variety of NLP tasks.

NLP techniques often leverage machine learning approaches, including deep learning, recurrent neural networks (RNNs), convolutional neural networks (CNNs), and transformer models, like the ones used in the GPT series developed by OpenAI.

Conclusion


Natural Language Processing has brought us closer to the dream of seamless communication between humans and machines. Consider translating languages on the fly, or searching via speech for a personalized customer experience in apps: NLP has woven itself into the fabric of modern society.


Guide to learning software testing


The Software Testing field offers a wealth of opportunities for those wishing to build a successful career in the Software Industry. 

Whether you're interested in software testing, quality assurance, or any other field, it's important to understand the steps you need to take to prepare for the testing role and improve your chances of getting the job. This blog post provides a comprehensive guide on how to start preparing for your role and ultimately getting hired.

Types of Software Testing

Software testing can broadly be divided into:
  1. Manual testing - where you test each and every page and feature manually.

  2. Automation testing - testing certain pages and features using tools; this also includes manual testing.

Nowadays, learning only the manual aspect of software testing is not enough, as there is a lot of competition and job saturation in software testing - so it is always suggested that you build your skills in both manual and automation testing.

Just to give you a heads-up: automation testing is not hard to learn, but it does require development skills, and you need to learn at least one coding language like Java, JavaScript, Python, or Groovy.



How to start with testing

1. START WITH UNDERSTANDING -

Starting with an understanding of the topic is very important. For example:

  • What is software testing? - Simply put, it is a way to test code the way the end user is going to use the software, not the way the developer wrote it. It is the tester's mindset that makes software testing effective.

  • How does it work? - Software testing follows a definite process. It starts at the time the application is designed. The common steps involved are:
                       Understanding requirements,
                       Creating test cases,
                       Identifying test cases eligible for automation,
                       Defining performance test thresholds,
                       Defining the test environment,
                       Executing tests,
                       Verifying and logging defects,
                       Creating reports

  • Why is it needed? - As a user, if you find any issue with an application you are interacting with, you will immediately reject it and never use the app again. So, as a business owner, you would never want any defect, big or small, to be there when you publish your application to the world. Hence, well-tested software is very important.
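The "creating test cases", "executing tests", and "verifying and logging defects" steps above can be sketched as a tiny automated check. The function under test and the test cases are hypothetical, invented for illustration.

```python
def is_valid_username(name):
    """Hypothetical function under test: alphanumeric, 3-12 characters."""
    return name.isalnum() and 3 <= len(name) <= 12

# Test cases: (input, expected result), covering boundaries and bad input.
test_cases = [
    ("abc", True),          # minimum length boundary
    ("ab", False),          # below minimum length
    ("user123", True),      # happy path
    ("bad name!", False),   # invalid characters
]

defects = []
for value, expected in test_cases:
    actual = is_valid_username(value)
    if actual != expected:
        # in a real project this is where you would log a defect in Jira
        defects.append((value, expected, actual))

print("defects found:", defects)   # empty list: all cases pass
```

The same loop scales up into real test frameworks (pytest, JUnit, and so on), where each tuple becomes a named test and failures are reported automatically.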

2. KNOW THE REQUIRED SKILLS - 

You need to know the skill set required by the job responsibilities of the role you actually want to break into.

The most important skill for a quality assurance person is COMMUNICATION. Your communication skills count a lot when you are being judged for a QA role.

Communication skills can include:

  • Fluency in spoken English.
  • Managing email communication.
  • Knowing what to log when creating a defect.


A simple way to put it: software is made of three layers -

  • FrontEnd
  • MiddleWare
  • Backend



In the case of software testing, you need to gain a skill set in all three areas - which implies you need to learn:

  • How can you test the FrontEnd of the software?
      Skills and tools needed:
    • Manual testing of web applications using different browsers and/or mobile testing.
    • Tools like Jira, Confluence, and ALM.
    • Excel sheets.
    • Automation testing tools like Selenium.

  • How can you test the MiddleWare of the software?
      Skills and tools needed:
    • RESTful APIs and gateways.
    • Postman for API testing.
    • Burp for security testing.
    • Automation testing using Appium or REST Assured.

  • How can you test the Backend of the software?
      Skills and tools needed:
    • Understanding of relational database schemas.
    • Queries in SQL databases such as MySQL and Oracle.
    • Linux commands.
    • Basic knowledge of cloud services like Azure, AWS, and GCP.

Explore these areas and write down all the skills you need to learn.

What are the minimum skills needed for software testing?

To apply for a job, the minimum requirements you should have on your CV are:

  • SQL queries / CRUD (create, retrieve, update, and delete) operations in a database.
  • API testing / Postman.
  • Manual test case creation and the defect-logging process.
  • Excel sheet usage and basic formulas.
  • Testing an application on different types of browsers.
  • Knowledge of operating systems like Windows and macOS.
  • Basic manual testing approaches like smoke testing, regression testing, usability testing, and penetration testing.
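The first item above (SQL / CRUD) can be practiced entirely with Python's built-in sqlite3 module; the table and data below are made up for illustration.

```python
import sqlite3

# In-memory database: nothing touches disk, ideal for practice.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Create
cur.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
# Retrieve
rows = cur.execute("SELECT name FROM users").fetchall()
print(rows)
# Update
cur.execute("UPDATE users SET name = ? WHERE name = ?", ("bob", "alice"))
# Delete
cur.execute("DELETE FROM users WHERE name = ?", ("bob",))
remaining = cur.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(remaining)   # 0 rows left after the delete
conn.close()
```

Note the `?` placeholders: parameterized queries are also the habit that prevents SQL injection, another thing testers check for.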

3. START LEARNING:

Now the question is: how should you start learning software testing?
To start learning software testing, you do not need to spend money; there are a lot of courses available online:

1. Complete one course on manual software testing: you can follow any channel that you find easy to understand, but stick to it and complete the full course.
2. Complete one course on automation software testing.
3. You can also purchase a book on software testing.
4. Read blogs:
 You can follow the blogs below and learn software testing concepts.
  • StackExchange - They have weekly blogs coming out. You can also ask questions and get answers to any query related to software testing on their site.
  • Softwaretestingmaterial - They too have a weekly blog to follow.

Can I just learn manual testing and start applying for jobs?

Yes, you can, but keep in mind that these days manual-only testing jobs are very rare to find. Industries are looking for people with automation experience.

Is software testing a well-paying job?

Yes, it is a well-paid job. In the US you can earn anywhere from $60K to $150K annually, which is quite good.


Can I become a developer after I start my career in software testing?

Yes again, you can switch to a development role even after you start your career in software testing, but remember you will need to gain a development skill set for that.
Switching can be easier for you in two ways:
  • You can switch roles within your own organization, so you won't have to look for a new company.
  • Since you are already working in IT, you know the basics of how software works and what developers do; this understanding will make it easier for you to learn coding.

To know what to do  - check this blog 

Conclusion :

So in short, going into software testing does require some prep work and learning, but it is not very hard. If you are on this path of learning to be a tester, or you are already a software tester, let me know your thoughts below 👇.

Please subscribe to my blog for weekly posts on tech and new concepts. :)


Saturday, April 15, 2023

Cross-Site Scripting (XSS) - Security Check





In this post, we are going to explain in detail what exactly Cross-Site Scripting (XSS) is and why it is so important to remove any such vulnerability from your application.

Definition: 

XSS:

Cross-site scripting or XSS is a technique where untrusted code is injected into web pages viewed by others. A payload containing the untrusted code is sent to the web application as user input, which is then delivered to an unsuspecting user's browser and executed. Cross-site scripting vulnerabilities typically allow an attacker to impersonate an affected user. If the affected user has privileged access to the application, the attacker could gain complete control over all application functions and data.


Types of XSS:


The most common cause of an XSS vulnerability is the lack of strict checks, testing, and validation of user input or dynamic content before it is saved to the server, allowing client-side JavaScript to be injected in a manner that enables it to execute. There are three primary flavors of XSS:

  1. Reflected XSS: This kind of scripting is non-persistent, meaning the injected malicious script is immediately reflected back to the user in the response rather than being stored.
    Example:

    An XSS vulnerability was found in the Help menu located
    at
    https://commandcenter.com/help/ddhelp/
    ddhimpl/js/html/ddhelp.htm

    On this page, the "Search" functionality is vulnerable to XSS.
    Searching for the following term causes the script in the
    "onfocus" attribute to be executed repeatedly until the user
    leaves the page.
    ===============================================================
    " onfocus=alert(document.cookie) autofocus
    ===============================================================



  2. Stored XSS: This is persistent scripting, which occurs when data provided by the attacker is stored on the server without adequate sanitization, and the malicious data is then displayed on the web page to all users.
    Example:

    Adding information in a form input box - an attacker can type malicious HTML into a box that should contain the user's address:
    ===============================================================
    Address: 123 Street <iframe onload=alert(document.domain)></iframe>TX,US
    ===============================================================



  3. DOM-Based XSS: As the name implies, this attack happens in the client-side web browser, generally via client-side JavaScript that handles data from untrusted sources in an unsafe manner, typically writing it back to the DOM.
    Example:
    An application uses JavaScript in the DOM to read from one input field and display the value in another element:
    ===============================================================
    var name = document.getElementById('Name').value;
    results.innerHTML = 'Your name was submitted as: ' + name;
    ===============================================================



How to know if your application is vulnerable to XSS or CSRF:

Testing your application for XSS and CSRF is an essential part of any development process. The simplest indicators of a cross-site scripting vulnerability are:

  • Input coming into the web application is not validated before being saved or processed.
  • Output to the browser is not HTML encoded.
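Those two indicators boil down to a simple check: submit a marker payload and see whether the response reflects it verbatim. The sketch below illustrates the idea in Python; `fake_response` stands in for a body the application under test might return, so this is an illustration, not a complete scanner.

```python
# Marker payload: harmless markup that is easy to spot in a response.
PAYLOAD = "<s>HTML-injection-Test</s>"

def reflects_unencoded(response_body, payload=PAYLOAD):
    """True if the payload appears in the output without HTML encoding."""
    return payload in response_body

# Stand-in for the HTML body an application under test might return.
fake_response = "<p>Search results for: <s>HTML-injection-Test</s></p>"
safe_response = "<p>Search results for: &lt;s&gt;HTML-injection-Test&lt;/s&gt;</p>"

print(reflects_unencoded(fake_response))  # True -> likely vulnerable
print(reflects_unencoded(safe_response))  # False -> output was encoded
```

Tools like Burp automate exactly this loop at scale: inject, then search the response for the unencoded marker.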

METHODOLOGY 

The assessment consisted of several phases, each detailed below along with the methodology, associated findings, and subsequent recommendations. GIS Vulnerability Management utilizes Penetration Testing Methodologies as the standard basis for penetration testing execution. 


The tools utilized during penetration testing execution can be found here: https://owasp.org/www-project-web-security-testing-guide/v41/6-Appendix/A-Testing_Tools_Resource

VULNERABILITY ANALYSIS 

Vulnerability identification and analysis is the process of discovering flaws in the hosted application. These issues can be anywhere from the application design to service and network misconfiguration.

MANUAL VERIFICATION 

Automated scanning tools often fail to report some vulnerabilities. A testing methodology that solely relies on automated scan results can give a false sense of security. For vulnerabilities discovered through automated scanning, manual verification ensures report findings are accurate and that the vulnerabilities reported are an accurate representation of the environment.
  • Manually enter HTML and JavaScript code into input boxes and save. If the server saves the malicious content successfully, the application is vulnerable.


  • Use the Burp or Postman tool to test API endpoints with malicious content. For example, enter the malicious content (script) that needs to be uploaded to the server by modifying the "Value" parameter while forwarding the request, i.e., "<iframe onload=alert(document.domain)></iframe>" in the parameter;


  • Enter the malicious content (script) that needs to be uploaded to the server by modifying the "Value" parameter while forwarding the request (assuming there is no sanitization on the server side), i.e., "<s>HTML-injection-Test</s>" in the parameter;


Implications / Consequences of not Fixing the Issue:

As a consequence, the malicious data will appear to be part of the website and run within the user’s 
browser under the privileges of the web application.
This vulnerability can be used to conduct a number of browser-based attacks including:
  • Hijacking another user’s browser 
  • Capturing sensitive information viewed by application users 
  • Pseudo defacement of the application 
  • Port scanning of internal hosts (“internal” in relation to the users of the web application) 
  • Directed delivery of browser-based exploits 
  • Other malicious activities

Suggested Countermeasures:


  • Using frameworks that automatically escape XSS by design, such as the latest Ruby on Rails or React JS.
  • Escaping untrusted HTTP request data based on the context in the HTML output (body, attribute, JavaScript, CSS, or URL) will resolve Reflected and Stored XSS vulnerabilities.

  • Applying context-sensitive encoding when modifying the browser document on the client side acts against DOM XSS.

  • Enabling a Content Security Policy (CSP) as a defense-in-depth mitigating control against XSS. It is effective if no other vulnerabilities exist that would allow placing malicious code via local file includes (e.g. path traversal overwrites or vulnerable libraries from permitted content delivery networks).

  • Validating and Sanitizing every input entered by the user before saving should always be considered of prime importance during building the application.

  • Check the impact of test input. Testers should analyze the results of the selected inputs and determine whether the discovered vulnerabilities impact the security of the application.
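The escaping countermeasures above boil down to encoding untrusted data for its output context. A minimal sketch using Python's standard `html.escape`, with the address string reusing the stored-XSS payload from earlier:

```python
import html

# HTML-encode untrusted data before writing it into the page body, so any
# injected markup renders as inert text instead of executing.
address = "123 Street <iframe onload=alert(document.domain)></iframe> TX,US"
escaped = html.escape(address)
print(escaped)   # the <iframe> becomes &lt;iframe&gt;, shown as plain text
```

Note this only covers the HTML-body context; attribute, JavaScript, CSS, and URL contexts each need their own encoding rules, which is why framework auto-escaping is the safer default.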

Conclusion:

Cross-Site Scripting (XSS) is an avoidable vulnerability if the design, implementation, and testing of the application are done considering all factors of penetration testing.








